Data transfer for large datasets with moderate to high network bandwidth

This article gives an overview of data transfer solutions for moving large datasets when your environment has moderate to high network bandwidth. It also describes the recommended data transfer options and the key capability matrix for this scenario.

For an overview of all the available data transfer options, see Choose an Azure data transfer solution.

Scenario description

Large datasets refer to data sizes on the order of terabytes (TB) to petabytes (PB). Moderate to high network bandwidth refers to 100 Mbps to 100 Gbps.

The options recommended in this scenario depend on whether you have moderate network bandwidth or high network bandwidth.

Moderate network bandwidth (100 Mbps - 1 Gbps)

With moderate network bandwidth, you need to project the time for data transfer over the network.

Use the following table to estimate the transfer time and, based on that estimate, choose between an offline transfer and a transfer over the network. The table shows the projected time for network data transfer at various available network bandwidths (assuming 90% utilization).

Network transfer or offline transfer
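The projection described above is simple arithmetic, so it can be reproduced for any dataset size and link speed. The following is a minimal sketch, not an official calculator. The function name `transfer_days` is hypothetical, terabytes are assumed to be decimal (10^12 bytes), and the 90% utilization default comes from the assumption stated above.

```python
def transfer_days(data_tb: float, bandwidth_mbps: float, utilization: float = 0.9) -> float:
    """Estimate the days needed to send data_tb terabytes over a network link.

    Assumptions (not from the article): 1 TB = 10**12 bytes and the link
    sustains the given utilization for the whole transfer.
    """
    if data_tb < 0:
        raise ValueError("data_tb must be non-negative")
    if bandwidth_mbps <= 0 or not 0 < utilization <= 1:
        raise ValueError("bandwidth must be positive and utilization in (0, 1]")

    total_bits = data_tb * 1e12 * 8                      # terabytes -> bits
    effective_bps = bandwidth_mbps * 1e6 * utilization   # usable bits per second
    seconds = total_bits / effective_bps
    return seconds / 86_400                              # seconds -> days


if __name__ == "__main__":
    # Project the transfer time for 100 TB across the bandwidth range in this scenario.
    for mbps in (100, 500, 1_000, 10_000):
        print(f"{mbps:>6} Mbps: {transfer_days(100, mbps):7.1f} days for 100 TB")
```

Under these assumptions, 100 TB takes roughly 103 days at 100 Mbps and roughly 10 days at 1 Gbps. What counts as "too slow" depends on your own deadline, so the sketch deliberately leaves that threshold to the caller.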

  • If the network transfer is projected to be too slow, use a physical device. The recommended options in this case are the offline transfer devices from the Azure Data Box family or Azure Import/Export using your own disks.

    • Azure Data Box family for offline transfers – Use Microsoft-supplied Data Box devices to move large amounts of data to Azure when you're limited by time, network availability, or costs. Copy on-premises data using tools such as Robocopy. Depending on the data size intended for transfer, you can choose Data Box Disk.
    • Azure Import/Export – Use the Azure Import/Export service to securely import large amounts of data to Azure Blob storage and Azure Files by shipping your own disk drives. You can also use this service to transfer data from Azure Blob storage to disk drives and ship them to your on-premises sites.
  • If the network transfer is projected to be reasonable, you can use any of the tools detailed in High network bandwidth.

High network bandwidth (1 Gbps - 100 Gbps)

If the available network bandwidth is high, use one of the following tools.

  • AzCopy – Use this command-line tool to copy data to and from Azure Blobs, Files, and Table storage with optimal performance. AzCopy supports concurrency and parallelism, and can resume copy operations when interrupted.
  • Azure Storage REST APIs/SDKs – When building an application, you can develop it against the Azure Storage REST APIs and use the Azure SDKs offered in multiple languages.
  • Azure Data Factory – Use Data Factory when you need to scale out a transfer operation and when you need orchestration and enterprise-grade monitoring capabilities. Use Data Factory to regularly transfer files between several Azure services, on-premises, or a combination of the two. With Data Factory, you can create and schedule data-driven workflows (called pipelines) that ingest data from disparate data stores and automate data movement and data transformation.
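As an illustration of the command-line option in the list above, the sketch below assembles an `azcopy copy` invocation for a recursive upload of a local folder to a blob container. It only builds and prints the command; it does not run it. The helper name `azcopy_upload_command`, the local path, and the angle-bracket placeholders in the URL are hypothetical and must be replaced with your own values.

```python
import shlex
from typing import List


def azcopy_upload_command(source_dir: str, container_url_with_sas: str) -> List[str]:
    """Build the argument list for a recursive AzCopy upload.

    `azcopy copy <source> <destination> --recursive` copies a directory tree.
    The destination is assumed to be a container URL that carries a SAS token.
    """
    return ["azcopy", "copy", source_dir, container_url_with_sas, "--recursive"]


# Hypothetical source path and placeholder destination; substitute real values.
cmd = azcopy_upload_command(
    "/data/archive",
    "https://<account>.blob.core.windows.net/<container>?<SAS-token>",
)

# Print a shell-safe form of the command for review.
print(shlex.join(cmd))

# To execute it, AzCopy must be installed and on PATH, for example:
#   import subprocess
#   subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than a single string avoids shell-quoting problems with the `?` and `&` characters that appear in SAS tokens.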

Comparison of key capabilities

The following tables summarize the differences in key capabilities for the recommended options.

Moderate network bandwidth

If using offline data transfer, use the following table to understand the differences in key capabilities.

| | Data Box Disk | Import/Export |
|---|---|---|
| Data size | Up to 35 TBs | Variable |
| Data type | Azure Blobs, Azure Files* | Azure Blobs, Azure Files |
| Form factor | 5 SSDs per order | Up to 10 HDDs/SSDs per order |
| Initial setup time | Low (15 mins) | Moderate to difficult (variable) |
| Send data to Azure | Yes | Yes |
| Export data from Azure | No | Yes |
| Encryption | AES 128-bit | AES 128-bit |
| Hardware | Microsoft supplied | Customer supplied |
| Network interface | USB 3.1/SATA | SATA II/SATA III |
| Partner integration | Some | Some |
| Shipping | Microsoft managed | Customer managed |
| Use when data moves | Within a commerce boundary | Across geographic boundaries |

* Data Box Disk does not support large file shares and does not preserve file metadata.

If using online data transfer, use the table in the following section for high network bandwidth.

High network bandwidth

| Tools | AzCopy, Azure PowerShell, Azure CLI | Azure Storage REST APIs, SDKs | Azure Data Factory |
|---|---|---|---|
| Data type | Azure Blobs, Azure Files, Azure Tables | Azure Blobs, Azure Files, Azure Tables | Supports 70+ data connectors for data stores and formats |
| Form factor | Command-line tools | Programmatic interface | Service in Azure portal |
| Initial one-time setup | Easy | Moderate | Extensive |
| Data pre-processing | No | No | Yes |
| Transfer from other clouds | No | No | Yes |
| User type | IT Pro or dev | Dev | IT Pro |
| Pricing | Free, data egress charges apply | Free, data egress charges apply | See Data Factory pricing |
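The capability matrix above can also be read as a simple filter: state what the transfer requires and keep the tools that provide it. The sketch below encodes only the setup, pre-processing, and other-cloud rows from the table. The names `TOOLS` and `tools_for` are hypothetical, and the result is a shortlist to evaluate further, not a recommendation.

```python
from typing import Dict, List

# Selected rows of the high network bandwidth comparison table.
TOOLS: Dict[str, Dict[str, object]] = {
    "AzCopy, Azure PowerShell, Azure CLI": {
        "setup": "Easy", "pre_processing": False, "other_clouds": False,
    },
    "Azure Storage REST APIs, SDKs": {
        "setup": "Moderate", "pre_processing": False, "other_clouds": False,
    },
    "Azure Data Factory": {
        "setup": "Extensive", "pre_processing": True, "other_clouds": True,
    },
}


def tools_for(pre_processing: bool = False, other_clouds: bool = False) -> List[str]:
    """Return the tools that satisfy every requested capability."""
    matches = []
    for name, caps in TOOLS.items():
        if pre_processing and not caps["pre_processing"]:
            continue  # data pre-processing is required but not offered
        if other_clouds and not caps["other_clouds"]:
            continue  # transfer from other clouds is required but not offered
        matches.append(name)
    return matches


if __name__ == "__main__":
    print(tools_for())                   # no special requirements
    print(tools_for(other_clouds=True))  # source data lives in another cloud
```

With no special requirements, all three options remain, and the setup effort and user type rows are then the deciding factors. Requiring either pre-processing or transfer from another cloud leaves only Azure Data Factory.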

Next steps