Data transfer for large datasets with moderate to high network bandwidth
This article provides an overview of data transfer solutions for when you have moderate to high network bandwidth in your environment and plan to transfer large datasets. It also describes the recommended data transfer options and their key capability matrix for this scenario.
For an overview of all available data transfer options, see Choose an Azure data transfer solution.
Scenario description
Large datasets are datasets on the order of terabytes (TB) to petabytes (PB). Moderate to high network bandwidth refers to 100 Mbps to 10 Gbps.
Recommended options
The options recommended in this scenario depend on whether you have moderate or high network bandwidth.
Moderate network bandwidth (100 Mbps - 1 Gbps)
With moderate network bandwidth, you need to project the time for data transfer over the network.
Use the following table to estimate the time and, based on that, choose between an offline transfer and a transfer over the network. The table shows the projected time for network data transfer at various available network bandwidths (assuming 90% utilization).
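The projection described above is simple arithmetic, and a minimal Python sketch may make it concrete. The function name `transfer_time_days` is hypothetical, and the sketch assumes decimal units (1 TB = 10^12 bytes, 1 Mbps = 10^6 bits per second) with the 90% utilization figure as the default.

```python
def transfer_time_days(dataset_tb: float, bandwidth_mbps: float,
                       utilization: float = 0.9) -> float:
    """Project the days needed to move dataset_tb terabytes over a link.

    Assumes decimal units: 1 TB = 1e12 bytes, 1 Mbps = 1e6 bits/second.
    """
    if dataset_tb < 0 or bandwidth_mbps <= 0 or not 0 < utilization <= 1:
        raise ValueError("size must be >= 0, bandwidth > 0, utilization in (0, 1]")
    total_bits = dataset_tb * 1e12 * 8                      # terabytes -> bits
    effective_bps = bandwidth_mbps * 1e6 * utilization      # usable bits/second
    return total_bits / effective_bps / 86_400              # seconds -> days


if __name__ == "__main__":
    # Print projections for a few dataset sizes and link speeds.
    for tb in (1, 10, 100, 1000):
        for mbps in (100, 1_000, 10_000):
            days = transfer_time_days(tb, mbps)
            print(f"{tb:>5} TB at {mbps:>6} Mbps: {days:10.2f} days")
```

Under these assumptions, 1 TB over a 100 Mbps link projects to roughly 1.03 days, and 100 TB over a 1 Gbps link to roughly 10.3 days.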
If the network transfer is projected to be too slow, you should use a physical device. The recommended options in this case are the offline transfer devices from the Azure Data Box family or Azure Import/Export using your own disks.
- Azure Data Box family for offline transfers – Use Azure-supplied Data Box devices to move large amounts of data to Azure when you're limited by time, network availability, or costs. Copy on-premises data using tools such as Robocopy. Depending on the data size intended for transfer, you can choose Data Box Disk.
- Azure Import/Export – Use the Azure Import/Export service to securely import large amounts of data to Azure Blob storage and Azure Files by shipping your own disk drives. You can also use this service to transfer data from Azure Blob storage to disk drives and ship them to your on-premises sites.
If the network transfer is projected to be reasonable, you can use any of the tools detailed in High network bandwidth.
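The choice between offline and online transfer can be sketched as a comparison between the projected network time and the time you can afford. The article does not define what counts as "too slow", so the `deadline_days` parameter below is a hypothetical threshold that you supply; the function name is also illustrative, and decimal units are assumed.

```python
def choose_transfer_mode(dataset_tb: float, bandwidth_mbps: float,
                         deadline_days: float, utilization: float = 0.9) -> str:
    """Return "online" if the projected network transfer fits the deadline,
    otherwise "offline" (Data Box family or Import/Export).

    deadline_days is an assumed, caller-supplied threshold; the source
    article does not prescribe one.
    """
    if bandwidth_mbps <= 0 or not 0 < utilization <= 1:
        raise ValueError("bandwidth must be > 0 and utilization in (0, 1]")
    # Projected seconds: bits to move divided by usable bits per second.
    seconds = dataset_tb * 8e12 / (bandwidth_mbps * 1e6 * utilization)
    projected_days = seconds / 86_400
    return "online" if projected_days <= deadline_days else "offline"


if __name__ == "__main__":
    print(choose_transfer_mode(100, 100, deadline_days=30))   # about 103 days projected
    print(choose_transfer_mode(10, 1_000, deadline_days=7))   # about 1 day projected
```

In practice, cost and network availability also factor into the decision, so a single deadline comparison is only a starting point.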
High network bandwidth (1 Gbps - 100 Gbps)
If the available network bandwidth is high, use one of the following tools.
- AzCopy - Use this command-line tool to easily copy data to and from Azure Blobs, Files, and Table storage with optimal performance. AzCopy supports concurrency and parallelism, and it can resume copy operations that were interrupted.
- Azure Storage REST APIs/SDKs – When building an application, you can develop it against the Azure Storage REST APIs and use the Azure SDKs offered in multiple languages.
- Azure Data Factory – Use Data Factory to scale out a transfer operation and when you need orchestration and enterprise-grade monitoring capabilities. Use Data Factory to regularly transfer files between several Azure services, on-premises, or a combination of the two. With Data Factory, you can create and schedule data-driven workflows (called pipelines) that ingest data from disparate data stores and automate data movement and data transformation.
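As an illustration of the command-line option, the following sketch builds AzCopy invocations from Python. It assumes AzCopy v10 is installed and on the PATH, and uses its `azcopy copy ... --recursive` and `azcopy jobs resume <job-id>` commands. The helper names, the local path, and the destination URL are hypothetical placeholders; the script only prints the command rather than running it.

```python
import shlex
from typing import List


def build_azcopy_copy(source: str, destination: str,
                      recursive: bool = True) -> List[str]:
    """Build the argument list for an `azcopy copy` invocation."""
    cmd = ["azcopy", "copy", source, destination]
    if recursive:
        cmd.append("--recursive")  # copy directory contents recursively
    return cmd


def build_azcopy_resume(job_id: str) -> List[str]:
    """Build the argument list to resume an interrupted AzCopy job."""
    return ["azcopy", "jobs", "resume", job_id]


if __name__ == "__main__":
    # Placeholder values: substitute your own path, account, container, and SAS.
    cmd = build_azcopy_copy(
        "/data/archive",
        "https://<account>.blob.core.windows.net/<container>?<SAS>",
    )
    print(shlex.join(cmd))
    # To execute for real, pass the list to subprocess.run(cmd, check=True).
```

Building the command as a list rather than a single string avoids shell-quoting problems with the SAS token when the command is later passed to `subprocess.run`.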
Comparison of key capabilities
The following tables summarize the differences in key capabilities for the recommended options.
Moderate network bandwidth
If using offline data transfer, use the following table to understand the differences in key capabilities.
Capability | Data Box Disk | Import/Export |
---|---|---|
Data size | Up to 35 TB | Variable |
Data type | Azure Blobs | Azure Blobs, Azure Files |
Form factor | 5 SSDs per order | Up to 10 HDDs/SSDs per order |
Initial setup time | Low (15 mins) | Moderate to difficult (variable) |
Send data to Azure | Yes | Yes |
Export data from Azure | No | Yes |
Encryption | AES 128-bit | AES 128-bit |
Hardware | Azure supplied | Customer supplied |
Network interface | USB 3.1/SATA | SATA II/SATA III |
Partner integration | Some | Some |
Shipping | Azure managed | Customer managed |
Use when data moves | Within a commerce boundary | Across geographic boundaries, e.g. US to EU |
Pricing | Pricing | Pricing |
If using online data transfer, use the table in the following section for high network bandwidth.
High network bandwidth
Capability | Tools: AzCopy, Azure PowerShell, Azure CLI | Azure Storage REST APIs, SDKs | Azure Data Factory |
---|---|---|---|
Data type | Azure Blobs, Azure Files, Azure Tables | Azure Blobs, Azure Files, Azure Tables | Supports 70+ data connectors for data stores and formats |
Form factor | Command-line tools | Programmatic interface | Service in Azure portal |
Initial one-time setup | Easy | Moderate | Extensive |
Data pre-processing | No | No | Yes |
Transfer from other clouds | No | No | Yes |
User type | IT Pro or dev | Dev | IT Pro |
Pricing | Free, data egress charges apply | Free, data egress charges apply | Pricing |
Next steps
Learn how to transfer data with Import/Export.
Understand how to
Learn how to transfer data with Azure Data Factory.
Use the REST APIs to transfer data