Tutorial: Copy data to Azure Data Box Disk and verify

After the disks are connected and unlocked, you can copy data from your source data server to your disks. After the data copy is complete, you should validate the data to ensure that it uploads successfully to Azure.

This tutorial describes how to copy data from your host computer and then generate checksums to verify data integrity.

In this tutorial, you learn how to:

  • Copy data to Data Box Disk
  • Verify data

Prerequisites

Before you begin, make sure that:

Copy data to disks

Review the following considerations before you copy the data to the disks:

  • It is your responsibility to ensure that you copy the data to folders that correspond to the appropriate data format. For instance, copy block blob data to the folder for block blobs. If the data format does not match the folder (storage type), the data upload to Azure fails at a later step.

  • While copying data, ensure that the data size conforms to the size limits described in the Azure storage and Data Box Disk limits.

  • If data that is being uploaded by Data Box Disk is concurrently uploaded by other applications outside of Data Box Disk, upload job failures and data corruption could result.

    Important

    If you specified managed disks as one of the storage destinations during order creation, the following section is applicable.

  • You can have only one managed disk with a given name in a resource group, across all the precreated folders and across all the Data Box Disks. This means that the VHDs uploaded to the precreated folders must have unique names. Make sure that a given name does not match an existing managed disk in the resource group. If VHDs have the same name, only one VHD is converted to a managed disk with that name. The other VHDs are uploaded as page blobs into the staging storage account.

  • Always copy the VHDs to one of the precreated folders. If you copy the VHDs outside of these folders, or into a folder that you created, the VHDs are uploaded to the Azure Storage account as page blobs, not managed disks.

  • Only fixed VHDs can be uploaded to create managed disks. Dynamic VHDs, differencing VHDs, and VHDX files are not supported.
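Because only fixed VHDs are supported, an optional pre-check before copying is to inspect each VHD's 512-byte footer: the footer starts with the cookie `conectix`, and the big-endian disk-type field at offset 60 is 2 for fixed, 3 for dynamic, and 4 for differencing. The helper below is an illustrative sketch and is not part of the Data Box tooling:

```python
import struct

# VHD footer layout (last 512 bytes of the file):
#   bytes 0-7  : cookie, always b"conectix"
#   bytes 60-63: disk type, big-endian uint32: 2=fixed, 3=dynamic, 4=differencing
VHD_TYPES = {2: "fixed", 3: "dynamic", 4: "differencing"}

def vhd_disk_type(path):
    """Return 'fixed', 'dynamic', or 'differencing' for a VHD file."""
    with open(path, "rb") as f:
        f.seek(-512, 2)               # the footer is the last 512 bytes
        footer = f.read(512)
    if footer[0:8] != b"conectix":
        raise ValueError(f"{path} does not look like a VHD")
    (disk_type,) = struct.unpack(">I", footer[60:64])
    return VHD_TYPES.get(disk_type, "unknown")
```

Run this against each VHD before placing it in a precreated ManagedDisk folder; anything other than `fixed` should be converted or excluded.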

Perform the following steps to connect and copy data from your computer to the Data Box Disk.

  1. View the contents of the unlocked drive. The list of precreated folders and subfolders on the drive differs depending on the options you selected when placing the Data Box Disk order. If a precreated folder does not exist, do not create it; data copied to a user-created folder fails to upload to Azure.

    | Selected storage destination      | Storage account type | Staging storage account type | Folders and subfolders                                                             |
    |-----------------------------------|----------------------|------------------------------|------------------------------------------------------------------------------------|
    | Storage account                   | GPv1 or GPv2         | NA                           | BlockBlob, PageBlob, AzureFile                                                     |
    | Storage account                   | Blob storage account | NA                           | BlockBlob                                                                          |
    | Managed disks                     | NA                   | GPv1 or GPv2                 | ManagedDisk (PremiumSSD, StandardSSD, StandardHDD)                                 |
    | Storage account and managed disks | GPv1 or GPv2         | GPv1 or GPv2                 | BlockBlob, PageBlob, AzureFile, ManagedDisk (PremiumSSD, StandardSSD, StandardHDD) |
    | Storage account and managed disks | Blob storage account | GPv1 or GPv2                 | BlockBlob, ManagedDisk (PremiumSSD, StandardSSD, StandardHDD)                      |

    An example screenshot of an order where a GPv2 storage account was specified is shown below:

    [Screenshot: contents of the disk drive]

  2. Copy the data that needs to be imported as block blobs into the BlockBlob folder. Similarly, copy data such as VHD/VHDX files into the PageBlob folder, and file-share data into the AzureFile folder.

    A container is created in the Azure storage account for each subfolder under the BlockBlob and PageBlob folders. Files placed directly under the BlockBlob and PageBlob folders are copied into a default container, $root, under the Azure Storage account. Any files in the $root container are always uploaded as block blobs.

    Copy files into a folder within the AzureFile folder. A subfolder within the AzureFile folder creates a file share. Files copied directly to the AzureFile folder fail and are uploaded as block blobs instead.

    If files and folders exist in the root directory, you must move them to a different folder before you begin the data copy.

    Important

    All container, blob, and file names must conform to Azure naming conventions. If these rules are not followed, the data upload to Azure fails.
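Because subfolder names become container names, it can save a failed upload to pre-check them. A container name must be 3-63 characters of lowercase letters, digits, and hyphens, must start and end with a letter or digit, and must not contain consecutive hyphens. This checker is an illustrative sketch, not part of the Data Box tooling:

```python
import re

# Azure container names: 3-63 chars; lowercase letters, digits, hyphens;
# must start and end with a letter or digit; no consecutive hyphens.
CONTAINER_NAME = re.compile(r"^(?!.*--)[a-z0-9](?:[a-z0-9-]{1,61}[a-z0-9])?$")

def is_valid_container_name(name):
    """Return True if `name` satisfies the container naming rules above."""
    return len(name) >= 3 and bool(CONTAINER_NAME.match(name))
```

Run it over each subfolder name under BlockBlob and PageBlob before copying, and rename anything that fails.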

  3. When copying files, ensure that files do not exceed ~4.7 TiB for block blobs, ~8 TiB for page blobs, and ~1 TiB for Azure Files.
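A quick scan of the source data catches oversized files before the copy starts. The per-folder limits below are the approximate values from this tutorial; the scanning helper itself is an illustrative sketch, not part of the Data Box tooling:

```python
import os

TIB = 1024 ** 4
# Approximate per-file limits from this tutorial, keyed by destination folder.
LIMITS = {"BlockBlob": 4.7 * TIB, "PageBlob": 8 * TIB, "AzureFile": 1 * TIB}

def oversized_files(folder, limit_bytes):
    """Yield (path, size) for every file under `folder` larger than `limit_bytes`."""
    for root, _dirs, files in os.walk(folder):
        for name in files:
            path = os.path.join(root, name)
            size = os.path.getsize(path)
            if size > limit_bytes:
                yield path, size
```

For example, `list(oversized_files(source_dir, LIMITS["AzureFile"]))` lists files that would exceed the Azure Files limit.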

  4. You can use drag and drop with File Explorer to copy the data. You can also use any SMB-compatible file copy tool, such as Robocopy, to copy your data. Multiple copy jobs can be initiated using the following Robocopy command:

    Robocopy <source> <destination> * /MT:64 /E /R:1 /W:1 /NFL /NDL /FFT /Log:c:\RobocopyLog.txt

    The parameters and options for the command are tabulated as follows:

    | Parameter/Option | Description                                                                                                          |
    |------------------|----------------------------------------------------------------------------------------------------------------------|
    | Source           | Specifies the path to the source directory.                                                                          |
    | Destination      | Specifies the path to the destination directory.                                                                     |
    | /E               | Copies subdirectories, including empty directories.                                                                  |
    | /MT[:N]          | Creates multithreaded copies with N threads, where N is an integer between 1 and 128. The default value for N is 8.  |
    | /R:<N>           | Specifies the number of retries on failed copies. The default value of N is 1,000,000 (one million retries).         |
    | /W:<N>           | Specifies the wait time between retries, in seconds. The default value of N is 30 (a 30-second wait).                |
    | /NFL             | Specifies that file names are not logged.                                                                            |
    | /NDL             | Specifies that directory names are not logged.                                                                       |
    | /FFT             | Assumes FAT file times (two-second precision).                                                                       |
    | /Log:<Log File>  | Writes the status output to the log file (overwrites the existing log file).                                         |

    Multiple disks can be used in parallel, with multiple jobs running on each disk.
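One way to script several parallel jobs is to generate one Robocopy command line per disk and launch each with `subprocess` on Windows. The drive letters and paths below are hypothetical, and this builder is a sketch, not part of the Data Box tooling:

```python
def robocopy_cmd(source, destination, threads=16, log_file=None):
    """Build a Robocopy argument list using the options shown in this tutorial."""
    cmd = ["Robocopy", source, destination, "*",
           f"/MT:{threads}", "/E", "/R:1", "/W:1", "/NFL", "/NDL", "/FFT"]
    if log_file:
        cmd.append(f"/Log:{log_file}")
    return cmd

# Example: one job per target disk (drive letters H and I are hypothetical).
jobs = [robocopy_cmd(r"C:\data\blockblobs", rf"{drive}:\BlockBlob",
                     log_file=rf"C:\logs\robocopy_{drive}.txt")
        for drive in ("H", "I")]
# On Windows, each entry in `jobs` can be handed to subprocess.Popen(...)
# so the copies run concurrently, one log file per disk.
```

Keeping a separate log file per disk makes it easy to trace failures back to a specific drive.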

  5. Check the copy status while the job is in progress. The following sample shows the output of a Robocopy command that copies files to the Data Box Disk.

    C:\Users>robocopy
        -------------------------------------------------------------------------------
       ROBOCOPY     ::     Robust File Copy for Windows
    -------------------------------------------------------------------------------
    
      Started : Thursday, March 8, 2018 2:34:53 PM
           Simple Usage :: ROBOCOPY source destination /MIR
    
                 source :: Source Directory (drive:\path or \\server\share\path).
            destination :: Destination Dir  (drive:\path or \\server\share\path).
                   /MIR :: Mirror a complete directory tree.
    
        For more usage information run ROBOCOPY /?    
    
    ****  /MIR can DELETE files as well as copy them !
    
    C:\Users>Robocopy C:\Git\azure-docs-pr\contributor-guide \\10.126.76.172\devicemanagertest1_AzFile\templates /MT:64 /E /R:1 /W:1 /FFT 
    -------------------------------------------------------------------------------
       ROBOCOPY     ::     Robust File Copy for Windows
    -------------------------------------------------------------------------------
    
      Started : Thursday, March 8, 2018 2:34:58 PM
       Source : C:\Git\azure-docs-pr\contributor-guide\
         Dest : \\10.126.76.172\devicemanagertest1_AzFile\templates\
    
        Files : *.*
    
      Options : *.* /DCOPY:DA /COPY:DAT /MT:8 /R:1000000 /W:30
    
    ------------------------------------------------------------------------------
    
    100%        New File                 206        C:\Git\azure-docs-pr\contributor-guide\article-metadata.md
    100%        New File                 209        C:\Git\azure-docs-pr\contributor-guide\content-channel-guidance.md
    100%        New File                 732        C:\Git\azure-docs-pr\contributor-guide\contributor-guide-index.md
    100%        New File                 199        C:\Git\azure-docs-pr\contributor-guide\contributor-guide-pr-criteria.md
                New File                 178        C:\Git\azure-docs-pr\contributor-guide\contributor-guide-pull-request-co100%  .md
                New File                 250        C:\Git\azure-docs-pr\contributor-guide\contributor-guide-pull-request-et100%  e.md
    100%        New File                 174        C:\Git\azure-docs-pr\contributor-guide\create-images-markdown.md
    100%        New File                 197        C:\Git\azure-docs-pr\contributor-guide\create-links-markdown.md
    100%        New File                 184        C:\Git\azure-docs-pr\contributor-guide\create-tables-markdown.md
    100%        New File                 208        C:\Git\azure-docs-pr\contributor-guide\custom-markdown-extensions.md
    100%        New File                 210        C:\Git\azure-docs-pr\contributor-guide\file-names-and-locations.md
    100%        New File                 234        C:\Git\azure-docs-pr\contributor-guide\git-commands-for-master.md
    100%        New File                 186        C:\Git\azure-docs-pr\contributor-guide\release-branches.md
    100%        New File                 240        C:\Git\azure-docs-pr\contributor-guide\retire-or-rename-an-article.md
    100%        New File                 215        C:\Git\azure-docs-pr\contributor-guide\style-and-voice.md
    100%        New File                 212        C:\Git\azure-docs-pr\contributor-guide\syntax-highlighting-markdown.md
    100%        New File                 207        C:\Git\azure-docs-pr\contributor-guide\tools-and-setup.md
    ------------------------------------------------------------------------------
    
                   Total    Copied   Skipped  Mismatch    FAILED    Extras
        Dirs :         1         1         1         0         0         0
       Files :        17        17         0         0         0         0
       Bytes :     3.9 k     3.9 k         0         0         0         0
       Times :   0:00:05   0:00:00                       0:00:00   0:00:00
    
       Speed :                5620 Bytes/sec.
       Speed :               0.321 MegaBytes/min.
       Ended : Thursday, March 8, 2018 2:34:59 PM
    
    C:\Users>
    

    To optimize performance, use the following Robocopy parameters when copying the data.

    | Platform      | Mostly small files (< 512 KB)                | Mostly medium files (512 KB-1 MB)            | Mostly large files (> 1 MB)                  |
    |---------------|----------------------------------------------|----------------------------------------------|----------------------------------------------|
    | Data Box Disk | 4 Robocopy sessions*, 16 threads per session | 2 Robocopy sessions*, 16 threads per session | 2 Robocopy sessions*, 16 threads per session |

    *Each Robocopy session can have a maximum of 7,000 directories and 150 million files.

    Note

    The parameters suggested above are based on the environment used in in-house testing.

    For more information on the Robocopy command, go to Robocopy and a few examples.

  6. Open the target folder to view and verify the copied files. If you encounter any errors during the copy process, download the log files for troubleshooting. The log files are written to the location specified in the Robocopy command.

Split and copy data to disks

This optional procedure can be used when you use multiple disks and have a large dataset that needs to be split and copied across all the disks. The Data Box Split Copy tool helps split and copy the data on a Windows computer.

Important

The Data Box Split Copy tool also validates your data. If you use the Split Copy tool to copy data, you can skip the validation step. The Split Copy tool is not supported with managed disks.

  1. On your Windows computer, ensure that the Data Box Split Copy tool is downloaded and extracted to a local folder. This tool was downloaded when you downloaded the Data Box Disk toolset for Windows.

  2. Open File Explorer. Make a note of the data source drive and the drive letters assigned to the Data Box Disks.

    [Screenshot: Split Copy data]

  3. Identify the source data to copy. For instance, in this case:

    • The following block blob data was identified.

      [Screenshot: Split Copy data 2]

    • The following page blob data was identified.

      [Screenshot: Split Copy data 3]

  4. Go to the folder where the software is extracted. Locate the SampleConfig.json file in that folder. This file is read-only; modify it and save your own copy.

    [Screenshot: Split Copy data 4]

  5. Modify the SampleConfig.json file.

    • Provide a job name. This creates a folder on the Data Box Disk that eventually becomes the container in the Azure storage account associated with these disks. The job name must follow the Azure container naming conventions.

    • Supply a source path, making note of the path format in the SampleConfigFile.json.

    • Enter the drive letters corresponding to the target disks. The data is taken from the source path and copied across multiple disks.

    • Provide a path for the log files. By default, they are written to the current directory where the .exe is located.

      [Screenshot: Split Copy data 5]
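The authoritative key names come from the SampleConfig.json that ships with the tool; the fragment below only illustrates, with hypothetical keys and values, the four pieces of information the steps above call for (job name, source path, target drive letters, log path):

```json
{
  "jobName": "mydata-import",
  "sourcePath": "C:\\data\\blockblobs",
  "targetDriveLetters": ["H", "I"],
  "logFilePath": "C:\\logs"
}
```

Match your real file's key names to SampleConfig.json exactly; only the values are yours to change.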

  6. To validate the file format, go to JSONlint. Save the file as ConfigFile.json.

    [Screenshot: Split Copy data 6]
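If you prefer to validate locally rather than pasting the file into JSONlint, any JSON parser will do; for example, a short Python check (an illustrative alternative, not part of the Data Box tooling):

```python
import json

def validate_json(path):
    """Parse the file; a malformed file raises json.JSONDecodeError,
    which reports the line and column of the syntax error."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)
```

Calling `validate_json("ConfigFile.json")` returns the parsed configuration on success and raises a descriptive error otherwise.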

  7. Open a Command Prompt window.

  8. Run DataBoxDiskSplitCopy.exe. Type:

    DataBoxDiskSplitCopy.exe PrepImport /config:<Your-config-file-name.json>

    [Screenshot: Split Copy data 7]

  9. Press Enter to continue the script.

    [Screenshot: Split Copy data 8]

  10. When the dataset is split and copied, a summary of the Split Copy tool's copy session is presented. A sample output is shown below.

    [Screenshot: Split Copy data 9]

  11. Verify that the data is split across the target disks.

    [Screenshots: Split Copy data 10, Split Copy data 11]

    If you examine the contents of the n: drive further, you see that two subfolders were created, corresponding to block blob and page blob format data.

    [Screenshot: Split Copy data 12]

  12. If the copy session fails, use the following command to recover and resume:

    DataBoxDiskSplitCopy.exe PrepImport /config:<configFile.json> /ResumeSession

If you see errors while using the Split Copy tool, go to how to troubleshoot Split Copy tool errors.

After the data copy is complete, you can proceed to validate your data. If you used the Split Copy tool, skip the validation (the Split Copy tool validates as well) and advance to the next tutorial.

Validate data

If you did not use the Split Copy tool to copy data, you need to validate your data. To verify the data, perform the following steps.

  1. Run DataBoxDiskValidation.cmd for checksum validation in the DataBoxDiskImport folder of your drive. This is available for the Windows environment only. Linux users need to validate that the source data copied to the disk meets the prerequisites.

    [Screenshot: Data Box Disk validation tool output]

  2. Choose the appropriate option. We recommend that you always validate the files and generate checksums by selecting option 2. Depending on your data size, this step may take a while. Once the script completes, exit the command window. If there are any errors during validation and checksum generation, you are notified, and a link to the error logs is provided.

    [Screenshot: checksum output]

    Tip

    • Reset the tool between two runs.
    • Use option 1 if you are dealing with a large dataset containing small files (~KBs). This option only validates the files, because checksum generation for such a dataset may take a very long time and performance could be very slow.
  3. If you are using multiple disks, run the command for each disk.
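Conceptually, checksum validation walks the copied tree and hashes every file so the same hashes can be compared after upload. The sketch below illustrates the idea with MD5; the actual algorithm and output format used by DataBoxDiskValidation.cmd may differ:

```python
import hashlib
import os

def checksum_tree(folder):
    """Return {relative_path: md5_hex} for every file under `folder`."""
    sums = {}
    for root, _dirs, files in os.walk(folder):
        for name in files:
            path = os.path.join(root, name)
            h = hashlib.md5()
            with open(path, "rb") as f:
                # Hash in 1 MiB chunks so large files do not exhaust memory.
                for chunk in iter(lambda: f.read(1 << 20), b""):
                    h.update(chunk)
            sums[os.path.relpath(path, folder)] = h.hexdigest()
    return sums
```

Running `checksum_tree` over the same folder on the source and on the disk, and comparing the two dictionaries, is the essence of what the validation step checks.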

If you see errors during validation, see troubleshoot validation errors.

Next steps

In this tutorial, you learned about Azure Data Box Disk topics such as:

  • Copy data to Data Box Disk
  • Verify data integrity

Advance to the next tutorial to learn how to return the Data Box Disk and verify the data upload to Azure.

Copy data to disks

Take the following steps to connect and copy data from your computer to the Data Box Disk.

  1. View the contents of the unlocked drive. The list of precreated folders and subfolders on the drive differs depending on the options selected when placing the Data Box Disk order.

  2. Copy the data to folders that correspond to the appropriate data format. For instance, copy unstructured data to the BlockBlob folder, VHD or VHDX data to the PageBlob folder, and files to the AzureFile folder. If the data format does not match the folder (storage type), the data upload to Azure fails at a later step.

    • Make sure that all containers, blobs, and files conform to Azure naming conventions and Azure object size limits. If these rules or limits are not followed, the data upload to Azure fails.
    • If your order has managed disks as one of the storage destinations, see the naming conventions for managed disks.
    • A container is created in the Azure storage account for each subfolder under the BlockBlob and PageBlob folders. Files placed directly under the BlockBlob and PageBlob folders are copied into a default container, $root, under the Azure Storage account. Any files in the $root container are always uploaded as block blobs.
    • Create a subfolder within the AzureFile folder. This subfolder maps to a file share in the cloud. Copy files to the subfolder. Files copied directly to the AzureFile folder fail and are uploaded as block blobs instead.
    • If files and folders exist in the root directory, you must move them to a different folder before you begin the data copy.
  3. Use drag and drop with File Explorer, or any SMB-compatible file copy tool such as Robocopy, to copy your data. Multiple copy jobs can be initiated using the following command:

    Robocopy <source> <destination>  * /MT:64 /E /R:1 /W:1 /NFL /NDL /FFT /Log:c:\RobocopyLog.txt

  4. Open the target folder to view and verify the copied files. If you encounter any errors during the copy process, download the log files for troubleshooting. The log files are written to the location specified in the Robocopy command.

Use the optional split-and-copy procedure when you use multiple disks and have a large dataset that needs to be split and copied across all the disks.

Validate data

Take the following steps to verify your data.

  1. Run DataBoxDiskValidation.cmd for checksum validation in the DataBoxDiskImport folder of your drive.

  2. Use option 2 to validate your files and generate checksums. Depending on your data size, this step may take a while. If there are any errors during validation and checksum generation, you are notified, and a link to the error logs is provided.

    For more information on data validation, see Validate data. If you experience errors during validation, see troubleshoot validation errors.