Configure, optimize, and troubleshoot AzCopy

AzCopy is a command-line utility that you can use to copy blobs or files to or from a storage account. This article helps you perform advanced configuration tasks and troubleshoot issues that can arise as you use AzCopy.

Configure proxy settings

To configure the proxy settings for AzCopy, set the https_proxy environment variable. If you run AzCopy on Windows, AzCopy automatically detects proxy settings, so you don't have to use this setting on Windows. If you do set it on Windows, it overrides automatic detection.

Operating system    Command
Windows             In a command prompt, use: set https_proxy=<proxy IP>:<proxy port>
                    In PowerShell, use: $env:https_proxy="<proxy IP>:<proxy port>"
Linux               export https_proxy=<proxy IP>:<proxy port>
macOS               export https_proxy=<proxy IP>:<proxy port>

Currently, AzCopy doesn't support proxies that require authentication with NTLM or Kerberos.
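If you'd rather not change the proxy setting for the whole shell session, one option is to set https_proxy only for the commands that need it. The following is a minimal sketch for Linux or macOS; the proxy address 10.0.0.5:8080 and the helper name run_via_proxy are hypothetical, and the final line uses a plain sh child process rather than AzCopy to show that the variable reaches the child.

```shell
# Hypothetical proxy address; replace with your own <proxy IP>:<proxy port>.
PROXY_ADDR="10.0.0.5:8080"

# Run any command with https_proxy set for that command only.
run_via_proxy() {
  env https_proxy="$PROXY_ADDR" "$@"
}

# Demonstration with a child shell; in practice you would pass an azcopy command.
run_via_proxy sh -c 'echo "proxy seen by child: $https_proxy"'
# prints: proxy seen by child: 10.0.0.5:8080
```

Because the variable is passed through env, the parent shell's own https_proxy value is left unchanged.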

Bypassing a proxy

If you are running AzCopy on Windows and you want it to use no proxy at all (instead of auto-detecting the settings), use these commands. With these settings, AzCopy will not look up or attempt to use any proxy.

Operating system    Environment             Commands
Windows             Command prompt (CMD)    set HTTPS_PROXY=dummy.invalid
                                            set NO_PROXY=*
Windows             PowerShell              $env:HTTPS_PROXY="dummy.invalid"
                                            $env:NO_PROXY="*"

On other operating systems, simply leave the HTTPS_PROXY variable unset if you want to use no proxy.

Optimize performance

You can benchmark performance, and then use commands and environment variables to find an optimal tradeoff between performance and resource consumption.

This section helps you perform these optimization tasks:

  • Run benchmark tests
  • Optimize throughput
  • Optimize memory use
  • Optimize file synchronization

Run benchmark tests

You can run a performance benchmark test on specific blob containers or file shares to view general performance statistics and to identify performance bottlenecks.

Use the following command to run a performance benchmark test.

Syntax:   azcopy benchmark 'https://<storage-account-name>.blob.core.chinacloudapi.cn/<container-name>'
Example:  azcopy benchmark 'https://mystorageaccount.blob.core.chinacloudapi.cn/mycontainer/myBlobDirectory?sv=2018-03-28&ss=bjqt&srs=sco&sp=rjklhjup&se=2019-05-10T04:37:48Z&st=2019-05-09T20:37:48Z&spr=https&sig=%2FSOVEFfsKDqRry4bk3qz1vAQFwY5DDzp2%2B%2F3Eykf%2FJLs%3D'

Tip

This example encloses path arguments with single quotes (''). Use single quotes in all command shells except for the Windows Command Shell (cmd.exe). If you're using a Windows Command Shell (cmd.exe), enclose path arguments with double quotes ("") instead of single quotes ('').

This command runs a performance benchmark by uploading test data to a specified destination. The test data is generated in memory, uploaded to the destination, and then deleted from the destination after the test is complete. You can use optional command parameters to specify how many files to generate and how large they should be.

For detailed reference docs, see azcopy benchmark.

To view detailed help guidance for this command, type azcopy benchmark -h and then press the ENTER key.
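As a hedged illustration of the optional parameters, the sketch below wraps a benchmark run in a small shell function. It assumes your AzCopy version accepts the --file-count and --size-per-file flags (confirm with azcopy benchmark -h); the function name, the file count of 500, and the 8M file size are arbitrary choices for the example, and the destination is a placeholder.

```shell
# Placeholder destination; replace with your own container URL and SAS token.
DEST='https://<storage-account-name>.blob.core.chinacloudapi.cn/<container-name>?<sas-token>'

# Benchmark with many small files instead of the default payload.
# Assumption: --file-count and --size-per-file are available in your AzCopy version.
bench_small_files() {
  # $1 = destination URL
  azcopy benchmark "$1" --file-count 500 --size-per-file 8M
}

# Uncomment once DEST points at a real container:
# bench_small_files "$DEST"
```

Choosing a file size close to your real workload tends to make the reported statistics more representative.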

Optimize throughput

You can use the --cap-mbps flag in your commands to place a ceiling on the throughput data rate. For example, the following command resumes a job and caps throughput at 10 megabits (Mb) per second.

azcopy jobs resume <job-id> --cap-mbps 10

Throughput can decrease when transferring small files. You can increase throughput by setting the AZCOPY_CONCURRENCY_VALUE environment variable. This variable specifies the number of concurrent requests that can occur.

If your computer has fewer than 5 CPUs, then the value of this variable is set to 32. Otherwise, the default value is equal to 16 multiplied by the number of CPUs. The maximum default value of this variable is 3000, but you can manually set this value higher or lower.
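The default rule described above can be worked through as a small calculation. This is only a sketch of the rule as stated in this article, not AzCopy's actual implementation, and the function name default_concurrency is made up for the example.

```shell
# Compute the default concurrency for a given CPU count, following the rule
# stated above: 32 below 5 CPUs, otherwise 16 x CPUs, capped at 3000.
default_concurrency() {
  cpus=$1
  if [ "$cpus" -lt 5 ]; then
    echo 32
    return
  fi
  value=$(( cpus * 16 ))
  if [ "$value" -gt 3000 ]; then
    value=3000
  fi
  echo "$value"
}

default_concurrency 4     # prints 32
default_concurrency 8     # prints 128
default_concurrency 256   # prints 3000 (16 x 256 = 4096 exceeds the cap)
```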

Operating system    Command
Windows             set AZCOPY_CONCURRENCY_VALUE=<value>
Linux               export AZCOPY_CONCURRENCY_VALUE=<value>
macOS               export AZCOPY_CONCURRENCY_VALUE=<value>

Use the azcopy env command to check the current value of this variable. If the value is blank, you can find out which value is being used by looking at the beginning of any AzCopy log file. The selected value, and the reason it was selected, are reported there.

Before you set this variable, we recommend that you run a benchmark test. The benchmark test process reports the recommended concurrency value. Alternatively, if your network conditions and payloads vary, set this variable to the word AUTO instead of to a particular number. That causes AzCopy to always run the same automatic tuning process that it uses in benchmark tests.

Optimize memory use

Set the AZCOPY_BUFFER_GB environment variable to specify the maximum amount of system memory that you want AzCopy to use when downloading and uploading files. Express this value in gigabytes (GB).

Operating system    Command
Windows             set AZCOPY_BUFFER_GB=<value>
Linux               export AZCOPY_BUFFER_GB=<value>
macOS               export AZCOPY_BUFFER_GB=<value>
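One way to pick a value is to derive it from the machine's total memory. The sketch below is Linux-specific (it reads /proc/meminfo) and uses one quarter of total memory, which is an arbitrary fraction chosen for illustration rather than a recommendation from AzCopy; where /proc/meminfo isn't readable it assumes a 4 GB machine.

```shell
# Read total memory in KiB (Linux only); assume 4 GiB if it can't be determined.
if [ -r /proc/meminfo ]; then
  total_kib=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
fi
total_kib=${total_kib:-4194304}

# Use a quarter of total memory, in whole gigabytes, with a floor of 1 GB.
buffer_gb=$(( total_kib / 1024 / 1024 / 4 ))
if [ "$buffer_gb" -lt 1 ]; then
  buffer_gb=1
fi

export AZCOPY_BUFFER_GB="$buffer_gb"
echo "AZCOPY_BUFFER_GB=$AZCOPY_BUFFER_GB"
```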

Optimize file synchronization

The sync command identifies all files at the destination, and then compares file names and last modified timestamps before starting the sync operation. If you have a large number of files, you can improve performance by eliminating this up-front processing.

To accomplish this, use the azcopy copy command instead, and set the --overwrite flag to ifSourceNewer. AzCopy will compare files as they are copied without performing any up-front scans and comparisons. This provides a performance edge in cases where there are a large number of files to compare.

The azcopy copy command doesn't delete files from the destination. If you want to delete files at the destination when they no longer exist at the source, use the azcopy sync command with the --delete-destination flag set to a value of true or prompt.
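The two approaches can be sketched side by side as shell functions. The function names are hypothetical and the source and destination arguments are placeholders that you supply; the flags are the ones named in this section, plus --recursive on the copy command to include subdirectories.

```shell
# Copy only files whose source is newer, with no up-front scan of the destination.
copy_newer() {
  # $1 = source path or URL, $2 = destination path or URL
  azcopy copy "$1" "$2" --recursive --overwrite=ifSourceNewer
}

# Full sync that also deletes destination files that no longer exist at the source.
mirror_with_delete() {
  # $1 = source path or URL, $2 = destination path or URL
  azcopy sync "$1" "$2" --delete-destination=true
}

# Example calls (placeholders, not run here):
# copy_newer '<local-directory>' 'https://<storage-account-name>.blob.core.chinacloudapi.cn/<container-name>?<sas-token>'
# mirror_with_delete '<local-directory>' 'https://<storage-account-name>.blob.core.chinacloudapi.cn/<container-name>?<sas-token>'
```

Use --delete-destination=prompt instead of true if you want AzCopy to ask before each deletion.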

Troubleshoot issues

AzCopy creates log and plan files for every job. You can use the logs to investigate and troubleshoot any potential problems.

The logs contain the failure status (UPLOADFAILED, COPYFAILED, or DOWNLOADFAILED), the full path, and the reason for the failure.

By default, the log and plan files are located in the %USERPROFILE%\.azcopy directory on Windows or the $HOME/.azcopy directory on Mac and Linux, but you can change that location if you want.

The relevant error isn't necessarily the first error that appears in the file. For errors such as network errors, timeouts, and Server Busy errors, AzCopy retries up to 20 times, and the retry process usually succeeds. The first error that you see might be something harmless that was successfully retried. So instead of looking at the first error in the file, look for the errors that are near UPLOADFAILED, COPYFAILED, or DOWNLOADFAILED.

Important

When submitting a request to Azure Support (or troubleshooting an issue that involves any third party), share the redacted version of the command you want to execute. This ensures that the SAS isn't accidentally shared with anybody. You can find the redacted version at the start of the log file.

Review the logs for errors

The following command gets all errors with UPLOADFAILED status from the 04dc9ca9-158f-7945-5933-564021086c79 log:

Windows (PowerShell)

Select-String UPLOADFAILED .\04dc9ca9-158f-7945-5933-564021086c79.log

Linux

grep UPLOADFAILED ./04dc9ca9-158f-7945-5933-564021086c79.log
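Because the relevant error is usually near a failure status rather than at the top of the file, it can help to search for all three statuses at once and print some preceding lines for context. The following sketch uses standard grep options; the function name show_failures and the choice of five context lines are arbitrary.

```shell
# Print every line containing a failure status, with line numbers and the
# five lines that precede each match.
show_failures() {
  # $1 = path to an AzCopy job log file
  grep -n -E -B 5 'UPLOADFAILED|COPYFAILED|DOWNLOADFAILED' "$1"
}

# Example call, using the job ID from this article:
# show_failures ./04dc9ca9-158f-7945-5933-564021086c79.log
```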

View and resume jobs

Each transfer operation creates an AzCopy job. Use the following command to view the history of jobs:

azcopy jobs list

To view the job statistics, use the following command:

azcopy jobs show <job-id>

To filter the transfers by status, use the following command:

azcopy jobs show <job-id> --with-status=Failed

Use the following command to resume a failed or canceled job. The command takes the job's identifier together with the SAS token, because AzCopy doesn't persist the SAS token for security reasons:

azcopy jobs resume <job-id> --source-sas="<sas-token>"
azcopy jobs resume <job-id> --destination-sas="<sas-token>"

Tip

Enclose path arguments such as the SAS token with single quotes (''). Use single quotes in all command shells except for the Windows Command Shell (cmd.exe). If you're using a Windows Command Shell (cmd.exe), enclose path arguments with double quotes ("") instead of single quotes ('').

When you resume a job, AzCopy looks at the job plan file. The plan file lists all the files that were identified for processing when the job was first created. AzCopy then attempts to transfer all of the files listed in the plan file that weren't already transferred.
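The inspect-then-resume sequence can be sketched as a single helper. The function name retry_failed is hypothetical; it assumes the SAS token belongs to the destination (use --source-sas instead when the token belongs to the source), and both arguments are placeholders that you supply.

```shell
# Show the failed transfers of a job, then resume it with a fresh SAS token.
retry_failed() {
  # $1 = job ID, $2 = SAS token for the destination
  azcopy jobs show "$1" --with-status=Failed
  azcopy jobs resume "$1" --destination-sas="$2"
}

# Example call (placeholders, not run here):
# retry_failed '<job-id>' '<sas-token>'
```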

Change the location of the plan and log files

By default, plan and log files are located in the %USERPROFILE%\.azcopy directory on Windows, or in the $HOME/.azcopy directory on Mac and Linux. You can change this location.

Change the location of plan files

Use any of these commands.

Operating system    Command
Windows             In PowerShell, use: $env:AZCOPY_JOB_PLAN_LOCATION="<value>"
                    In a command prompt, use: set AZCOPY_JOB_PLAN_LOCATION=<value>
Linux               export AZCOPY_JOB_PLAN_LOCATION=<value>
macOS               export AZCOPY_JOB_PLAN_LOCATION=<value>

Use the azcopy env command to check the current value of this variable. If the value is blank, then plan files are written to the default location.

Change the location of log files

Use any of these commands.

Operating system    Command
Windows             In PowerShell, use: $env:AZCOPY_LOG_LOCATION="<value>"
                    In a command prompt, use: set AZCOPY_LOG_LOCATION=<value>
Linux               export AZCOPY_LOG_LOCATION=<value>
macOS               export AZCOPY_LOG_LOCATION=<value>

Use the azcopy env command to check the current value of this variable. If the value is blank, then logs are written to the default location.
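As a sketch for Linux or macOS, the following sets both locations under one base directory and creates the directories first. The base directory $HOME/azcopy-data and its plans and logs subdirectories are hypothetical names chosen for this example.

```shell
# Hypothetical base directory for AzCopy working files.
AZCOPY_DATA_DIR="$HOME/azcopy-data"

# Create the target directories before pointing AzCopy at them.
mkdir -p "$AZCOPY_DATA_DIR/plans" "$AZCOPY_DATA_DIR/logs"

export AZCOPY_JOB_PLAN_LOCATION="$AZCOPY_DATA_DIR/plans"
export AZCOPY_LOG_LOCATION="$AZCOPY_DATA_DIR/logs"

# If AzCopy is installed, show the values it picks up.
if command -v azcopy >/dev/null 2>&1; then
  azcopy env
fi
```

These exports last only for the current shell session; add them to your shell profile if you want them to persist.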

Change the default log level

By default, the AzCopy log level is set to INFO. To reduce log verbosity and save disk space, override this setting by using the --log-level option.

Available log levels are: NONE, DEBUG, INFO, WARNING, ERROR, PANIC, and FATAL.
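For example, a copy that logs only errors might look like the sketch below. The function name quiet_copy is hypothetical, the source and destination are placeholders that you supply, and ERROR is just one of the levels listed above.

```shell
# Copy recursively while logging only errors, to keep log files small.
quiet_copy() {
  # $1 = source path or URL, $2 = destination path or URL
  azcopy copy "$1" "$2" --recursive --log-level=ERROR
}

# Example call (placeholders, not run here):
# quiet_copy '<local-directory>' 'https://<storage-account-name>.blob.core.chinacloudapi.cn/<container-name>?<sas-token>'
```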

Remove plan and log files

If you want to remove all plan and log files from your local machine to save disk space, use the azcopy jobs clean command.

To remove the plan and log files associated with only one job, use azcopy jobs rm <job-id>. Replace the <job-id> placeholder in this example with the ID of the job.