Copy data from Amazon S3 to Azure Storage by using AzCopy

AzCopy is a command-line utility that you can use to copy blobs or files to or from a storage account. This article shows how to use AzCopy to copy objects, directories, and buckets from Amazon Web Services (AWS) S3 to Azure Blob storage.

Choose how you'll provide authorization credentials

  • To authorize with Azure Storage, use Azure Active Directory (AD) or a Shared Access Signature (SAS) token.

  • To authorize with AWS S3, use an AWS access key and a secret access key.

Authorize with Azure Storage

See the Get started with AzCopy article to download AzCopy, and choose how you'll provide authorization credentials to the storage service.

Note

The examples in this article assume that you've authenticated your identity by using the AzCopy login command. AzCopy then uses your Azure AD account to authorize access to data in Blob storage.

If you'd rather use a SAS token to authorize access to blob data, append that token to the resource URL in each AzCopy command.

For example: https://mystorageaccount.blob.core.chinacloudapi.cn/mycontainer?<SAS-token>
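As a minimal sketch of the SAS approach, the following shell snippet builds a destination URL by appending a token as the query string. The account name, container name, and token are placeholders for illustration only, not real values.

```shell
# Hypothetical values for illustration; substitute your own account, container, and token.
account='mystorageaccount'
container='mycontainer'
sas_token='<SAS-token>'   # placeholder, not a real token

# Append the SAS token to the container URL as its query string.
dest_url="https://${account}.blob.core.chinacloudapi.cn/${container}?${sas_token}"

# Keep the URL quoted wherever it is used so the shell does not interpret
# characters such as '?' or '&' that appear in a real token.
printf '%s\n' "$dest_url"
```

The resulting value can then be passed, quoted, as the destination argument of an AzCopy command.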

Authorize with AWS S3

Gather your AWS access key and secret access key, and then set these environment variables:

Operating system    Command
Windows             set AWS_ACCESS_KEY_ID=<access-key>
                    set AWS_SECRET_ACCESS_KEY=<secret-access-key>
Linux               export AWS_ACCESS_KEY_ID=<access-key>
                    export AWS_SECRET_ACCESS_KEY=<secret-access-key>
macOS               export AWS_ACCESS_KEY_ID=<access-key>
                    export AWS_SECRET_ACCESS_KEY=<secret-access-key>
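Before starting a long transfer, it can help to confirm that both variables are present. The following is a small sketch for Linux or macOS shells; the function name aws_credentials_set is a hypothetical helper, not part of AzCopy.

```shell
# Return 0 only if both AWS variables are set and non-empty.
aws_credentials_set() {
    [ -n "${AWS_ACCESS_KEY_ID:-}" ] && [ -n "${AWS_SECRET_ACCESS_KEY:-}" ]
}

if aws_credentials_set; then
    echo "AWS credentials found in the environment."
else
    echo "Set AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY before running azcopy." >&2
fi
```

The check only verifies that the variables exist; it does not validate the keys against AWS.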

Copy objects, directories, and buckets

AzCopy uses the Put Block From URL API, so data is copied directly between AWS S3 and storage servers. These copy operations don't use the network bandwidth of your computer.

Tip

The examples in this section enclose path arguments with single quotes (''). Use single quotes in all command shells except the Windows Command Shell (cmd.exe). If you're using the Windows Command Shell (cmd.exe), enclose path arguments with double quotes ("") instead.

These examples also work with accounts that have a hierarchical namespace. Multi-protocol access on Data Lake Storage enables you to use the same URL syntax (blob.core.chinacloudapi.cn) on those accounts.

Copy an object

Use the same URL syntax (blob.core.chinacloudapi.cn) for accounts that have a hierarchical namespace.

Syntax: azcopy copy 'https://s3.amazonaws.com/<bucket-name>/<object-name>' 'https://<storage-account-name>.blob.core.chinacloudapi.cn/<container-name>/<blob-name>'
Example: azcopy copy 'https://s3.amazonaws.com/mybucket/myobject' 'https://mystorageaccount.blob.core.chinacloudapi.cn/mycontainer/myblob'
Example (hierarchical namespace): azcopy copy 'https://s3.amazonaws.com/mybucket/myobject' 'https://mystorageaccount.blob.core.chinacloudapi.cn/mycontainer/myblob'

Note

Examples in this article use path-style URLs for AWS S3 buckets (for example: http://s3.amazonaws.com/<bucket-name>).

You can also use virtual hosted-style URLs (for example: http://bucket.s3.amazonaws.com).

To learn more about virtual hosting of buckets, see Virtual Hosting of Buckets (https://docs.aws.amazon.com/AmazonS3/latest/dev/VirtualHosting.html).
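To illustrate how the two URL styles relate, here is a sketch that rewrites a path-style URL into the virtual hosted-style form. The function name to_virtual_hosted is hypothetical, and the sketch assumes the input uses the https://s3.amazonaws.com/ endpoint shown in the commands above.

```shell
# Convert https://s3.amazonaws.com/<bucket>/<key> into
# https://<bucket>.s3.amazonaws.com/<key>.
# Assumes the input starts with the global path-style endpoint.
to_virtual_hosted() {
    rest=${1#https://s3.amazonaws.com/}   # "<bucket>/<key>" or just "<bucket>"
    bucket=${rest%%/*}                    # text before the first slash
    case $rest in
        */*) key=${rest#*/} ;;            # text after the first slash
        *)   key='' ;;                    # bucket only, no object key
    esac
    printf 'https://%s.s3.amazonaws.com/%s\n' "$bucket" "$key"
}

to_virtual_hosted 'https://s3.amazonaws.com/mybucket/mydirectory/myobject'
# prints https://mybucket.s3.amazonaws.com/mydirectory/myobject
```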

Copy a directory

Use the same URL syntax (blob.core.chinacloudapi.cn) for accounts that have a hierarchical namespace.

Syntax: azcopy copy 'https://s3.amazonaws.com/<bucket-name>/<directory-name>' 'https://<storage-account-name>.blob.core.chinacloudapi.cn/<container-name>/<directory-name>' --recursive=true
Example: azcopy copy 'https://s3.amazonaws.com/mybucket/mydirectory' 'https://mystorageaccount.blob.core.chinacloudapi.cn/mycontainer/mydirectory' --recursive=true
Example (hierarchical namespace): azcopy copy 'https://s3.amazonaws.com/mybucket/mydirectory' 'https://mystorageaccount.blob.core.chinacloudapi.cn/mycontainer/mydirectory' --recursive=true

Note

This example appends the --recursive flag to copy files in all subdirectories.

Copy the contents of a directory

You can copy the contents of a directory without copying the containing directory itself by using the wildcard symbol (*).

Syntax: azcopy copy 'https://s3.amazonaws.com/<bucket-name>/<directory-name>/*' 'https://<storage-account-name>.blob.core.chinacloudapi.cn/<container-name>/<directory-name>' --recursive=true
Example: azcopy copy 'https://s3.amazonaws.com/mybucket/mydirectory/*' 'https://mystorageaccount.blob.core.chinacloudapi.cn/mycontainer/mydirectory' --recursive=true
Example (hierarchical namespace): azcopy copy 'https://s3.amazonaws.com/mybucket/mydirectory/*' 'https://mystorageaccount.blob.core.chinacloudapi.cn/mycontainer/mydirectory' --recursive=true

Copy a bucket

Use the same URL syntax (blob.core.chinacloudapi.cn) for accounts that have a hierarchical namespace.

Syntax: azcopy copy 'https://s3.amazonaws.com/<bucket-name>' 'https://<storage-account-name>.blob.core.chinacloudapi.cn/<container-name>' --recursive=true
Example: azcopy copy 'https://s3.amazonaws.com/mybucket' 'https://mystorageaccount.blob.core.chinacloudapi.cn/mycontainer' --recursive=true
Example (hierarchical namespace): azcopy copy 'https://s3.amazonaws.com/mybucket' 'https://mystorageaccount.blob.core.chinacloudapi.cn/mycontainer' --recursive=true

Copy all buckets in all regions

Use the same URL syntax (blob.core.chinacloudapi.cn) for accounts that have a hierarchical namespace.

Syntax: azcopy copy 'https://s3.amazonaws.com/' 'https://<storage-account-name>.blob.core.chinacloudapi.cn' --recursive=true
Example: azcopy copy 'https://s3.amazonaws.com' 'https://mystorageaccount.blob.core.chinacloudapi.cn' --recursive=true
Example (hierarchical namespace): azcopy copy 'https://s3.amazonaws.com' 'https://mystorageaccount.blob.core.chinacloudapi.cn' --recursive=true

Copy all buckets in a specific S3 region

Use the same URL syntax (blob.core.chinacloudapi.cn) for accounts that have a hierarchical namespace.

Syntax: azcopy copy 'https://s3-<region-name>.amazonaws.com/' 'https://<storage-account-name>.blob.core.chinacloudapi.cn' --recursive=true
Example: azcopy copy 'https://s3-rds.eu-north-1.amazonaws.com' 'https://mystorageaccount.blob.core.chinacloudapi.cn' --recursive=true
Example (hierarchical namespace): azcopy copy 'https://s3-rds.eu-north-1.amazonaws.com' 'https://mystorageaccount.blob.core.chinacloudapi.cn' --recursive=true
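If you need to copy from more than one region, a small loop can generate one command per region for review before anything runs. The sketch below follows the Syntax line above when building the endpoint; the region list, the storage account name, and the region_endpoint helper are assumptions for illustration.

```shell
# Build the regional S3 endpoint URL in the form https://s3-<region-name>.amazonaws.com/.
region_endpoint() {
    printf 'https://s3-%s.amazonaws.com/\n' "$1"
}

# Hypothetical values; replace with your own regions and storage account.
regions='eu-north-1 us-west-2'
dest='https://mystorageaccount.blob.core.chinacloudapi.cn'

# Print each command for review instead of running it.
for region in $regions; do
    printf "azcopy copy '%s' '%s' --recursive=true\n" "$(region_endpoint "$region")" "$dest"
done
```

Printing the commands first makes it easier to verify the endpoints and destination before starting a large transfer.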

Handle differences in object naming rules

AWS S3 has a different set of naming conventions for bucket names than Azure blob containers. You can read about them here. If you choose to copy a group of buckets to an Azure storage account, the copy operation might fail because of naming differences.

AzCopy handles two of the most common issues that can arise: buckets that contain periods and buckets that contain consecutive hyphens. AWS S3 bucket names can contain periods and consecutive hyphens, but a container name in Azure can't. AzCopy replaces periods with hyphens and consecutive hyphens with a number that represents the number of consecutive hyphens (for example, a bucket named my----bucket becomes my-4-bucket).

Also, as AzCopy copies over files, it checks for naming collisions and attempts to resolve them. For example, if there are buckets with the names bucket-name and bucket.name, AzCopy resolves a bucket named bucket.name first to bucket-name and then to bucket-name-2.
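To preview how a bucket name might be resolved, the following sketch approximates the two documented replacements. It is not AzCopy's own implementation: the function name resolve_bucket_name is hypothetical, the sketch assumes that periods are replaced before runs of hyphens are counted, and it does not model the collision handling (the -2 suffix) described above.

```shell
# Approximate the documented renaming: periods become hyphens, then each run of
# two or more hyphens becomes -<run length>-.
# Assumption: periods are replaced before hyphen runs are counted.
resolve_bucket_name() {
    printf '%s\n' "$1" | tr '.' '-' | awk '{
        out = ""
        # Consume the name left to right, rewriting each run of hyphens.
        while (match($0, /--+/)) {
            out = out substr($0, 1, RSTART - 1) "-" RLENGTH "-"
            $0 = substr($0, RSTART + RLENGTH)
        }
        print out $0
    }'
}

resolve_bucket_name 'my----bucket'   # prints my-4-bucket
resolve_bucket_name 'bucket.name'    # prints bucket-name
```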

Handle differences in object metadata

AWS S3 and Azure allow different sets of characters in the names of object keys. You can read about the characters that AWS S3 uses here. On the Azure side, blob object keys adhere to the naming rules for C# identifiers.

As part of an AzCopy copy command, you can provide a value for the optional s2s-handle-invalid-metadata flag, which specifies how to handle files whose metadata contains incompatible key names. The following table describes each flag value.

Flag value          Description
ExcludeIfInvalid    (Default option) The metadata isn't included in the transferred object. AzCopy logs a warning.
FailIfInvalid       Objects aren't copied. AzCopy logs an error and includes that error in the failed count that appears in the transfer summary.
RenameIfInvalid     AzCopy resolves the invalid metadata key, and copies the object to Azure using the resolved metadata key-value pair. To learn exactly what steps AzCopy takes to rename object keys, see the How AzCopy renames object keys section below. If AzCopy is unable to rename the key, then the object won't be copied.
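As an illustration of passing one of these values, the sketch below checks a chosen value against the three listed in the table above and then prints the resulting command rather than running it. The bucket, account, and container names are the placeholder values used elsewhere in this article, and is_valid_metadata_policy is a hypothetical helper.

```shell
# Accept only the three flag values described in this article.
is_valid_metadata_policy() {
    case $1 in
        ExcludeIfInvalid|FailIfInvalid|RenameIfInvalid) return 0 ;;
        *) return 1 ;;
    esac
}

policy='RenameIfInvalid'
if is_valid_metadata_policy "$policy"; then
    # Printed for review rather than executed.
    printf "azcopy copy '%s' '%s' --recursive=true --s2s-handle-invalid-metadata=%s\n" \
        'https://s3.amazonaws.com/mybucket' \
        'https://mystorageaccount.blob.core.chinacloudapi.cn/mycontainer' \
        "$policy"
else
    echo "Unknown value for s2s-handle-invalid-metadata: $policy" >&2
fi
```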

How AzCopy renames object keys

AzCopy performs these steps:

  1. Replaces invalid characters with '_'.

  2. Adds the string rename_ to the beginning of a new valid key.

    This key is used to save the original metadata value.

  3. Adds the string rename_key_ to the beginning of a new valid key. This key is used to save the original invalid metadata key. You can use this key to try to recover the metadata on the Azure side, because the metadata key is preserved as a value in the Blob storage service.
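The steps above can be sketched roughly as follows. This is an approximation, not AzCopy's code: it assumes the replacement character is an underscore, it treats any character outside letters, digits, and underscores as invalid (the real C# identifier rules differ in detail), and both function names are hypothetical.

```shell
# Step 1 (approximation): replace every character outside [A-Za-z0-9_] with '_'.
sanitize_key() {
    printf '%s\n' "$1" | sed 's/[^A-Za-z0-9_]/_/g'
}

# Steps 2 and 3: print the two metadata pairs derived from one invalid key.
#   rename_<valid key>     holds the original metadata value
#   rename_key_<valid key> holds the original (invalid) key
rename_metadata() {
    key=$1
    value=$2
    valid=$(sanitize_key "$key")
    printf 'rename_%s=%s\n' "$valid" "$value"
    printf 'rename_key_%s=%s\n' "$valid" "$key"
}

rename_metadata 'my-key' 'my value'
# prints:
#   rename_my_key=my value
#   rename_key_my_key=my-key
```

Because the original key survives as the value of the rename_key_ entry, a later script could read that entry to reconstruct the original key-value pair.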

Next steps

Find more examples in any of these articles: