Monitor, diagnose, and troubleshoot Azure Storage

Overview

Diagnosing and troubleshooting issues in a distributed application hosted in a cloud environment can be more complex than in traditional environments. Applications can be deployed in a PaaS or IaaS infrastructure, on premises, on a mobile device, or in some combination of these environments. Typically, your application's network traffic may traverse public and private networks, and your application may use multiple storage technologies such as Azure Storage Tables, Blobs, Queues, or Files in addition to other data stores such as relational and document databases.

To manage such applications successfully, you should monitor them proactively and understand how to diagnose and troubleshoot all aspects of them and their dependent technologies. As a user of Azure Storage services, you should continuously monitor the Storage services your application uses for any unexpected changes in behavior (such as slower than usual response times), and use logging to collect more detailed data and to analyze a problem in depth. The diagnostics information you obtain from both monitoring and logging will help you to determine the root cause of the issue your application encountered. Then you can troubleshoot the issue and determine the appropriate steps to remediate it. Azure Storage is a core Azure service and forms an important part of the majority of solutions that customers deploy to the Azure infrastructure. Azure Storage includes capabilities to simplify monitoring, diagnosing, and troubleshooting storage issues in your cloud-based applications.

Note

Azure Files does not support logging at this time.

For a hands-on guide to end-to-end troubleshooting in Azure Storage applications, see End-to-End Troubleshooting using Azure Storage Metrics and Logging, AzCopy, and Message Analyzer.

Introduction

This guide shows you how to use features such as Azure Storage Analytics, client-side logging in the Azure Storage Client Library, and other third-party tools to identify, diagnose, and troubleshoot Azure Storage related issues.

This guide is intended to be read primarily by developers of online services that use Azure Storage services and IT Pros responsible for managing such online services. The goals of this guide are:

  • To help you maintain the health and performance of your Azure Storage accounts.
  • To provide you with the necessary processes and tools to help you decide whether an issue or problem in an application relates to Azure Storage.
  • To provide you with actionable guidance for resolving problems related to Azure Storage.

How this guide is organized

The section "Monitoring your storage service" describes how to monitor the health and performance of your Azure Storage services using Azure Storage Analytics Metrics (Storage Metrics).

The section "Diagnosing storage issues" describes how to diagnose issues using Azure Storage Analytics Logging (Storage Logging). It also describes how to enable client-side logging using the facilities in one of the client libraries, such as the Storage Client Library for .NET or the Azure SDK for Java.

The section "End-to-end tracing" describes how you can correlate the information contained in various log files and metrics data.

The section "Troubleshooting guidance" provides troubleshooting guidance for some of the common storage-related issues you might encounter.

The "Appendices" include information about using other tools such as Wireshark and Netmon for analyzing network packet data, Fiddler for analyzing HTTP/HTTPS messages, and Microsoft Message Analyzer for correlating log data.

Monitoring your storage service

If you are familiar with Windows performance monitoring, you can think of Storage Metrics as an Azure Storage equivalent of Windows Performance Monitor counters. In Storage Metrics, you will find a comprehensive set of metrics (counters in Windows Performance Monitor terminology) such as service availability, the total number of requests to the service, or the percentage of successful requests to the service. For a full list of the available metrics, see Storage Analytics Metrics Table Schema. You can specify whether you want the storage service to collect and aggregate metrics every hour or every minute. For more information about how to enable metrics and monitor your storage accounts, see Enabling storage metrics and viewing metrics data.
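
As a minimal sketch, assuming the classic .NET Storage Client Library (the Microsoft.Azure.Storage packages; your project may use a different library or version), hourly and minute metrics for the Blob service can be enabled through the service properties:

using System;
using Microsoft.Azure.Storage;
using Microsoft.Azure.Storage.Blob;
using Microsoft.Azure.Storage.Shared.Protocol;

// Assumes connectionString holds the connection string for your storage account.
CloudStorageAccount account = CloudStorageAccount.Parse(connectionString);
CloudBlobClient blobClient = account.CreateCloudBlobClient();

ServiceProperties properties = blobClient.GetServiceProperties();
properties.HourMetrics.MetricsLevel = MetricsLevel.ServiceAndApi;   // per-service and per-API rows
properties.HourMetrics.RetentionDays = 7;                           // keep 7 days of hourly metrics
properties.MinuteMetrics.MetricsLevel = MetricsLevel.Service;
properties.MinuteMetrics.RetentionDays = 7;
blobClient.SetServiceProperties(properties);

The same steps can be repeated with CreateCloudTableClient and CreateCloudQueueClient if you also want metrics for the Table and Queue services.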

You can choose which hourly metrics you want to display in the Azure portal and configure rules that notify administrators by email whenever an hourly metric exceeds a particular threshold. For more information, see Receive Alert Notifications.

The storage service collects metrics on a best-effort basis, so it may not record every storage operation.

In the Azure portal, you can view metrics such as availability, total requests, and average latency numbers for a storage account. A notification rule has also been set up to alert an administrator if availability drops below a certain level. From viewing this data, one possible area for investigation is the table service success percentage being below 100% (for more information, see the section "Metrics show low PercentSuccess or analytics log entries have operations with transaction status of ClientOtherErrors").

You should continuously monitor your Azure applications to ensure they are healthy and performing as expected by:

  • Establishing some baseline metrics for your application that will enable you to compare current data and identify any significant changes in the behavior of Azure Storage and your application. The values of your baseline metrics will, in many cases, be application specific, and you should establish them when you are performance testing your application.
  • Recording minute metrics and using them to monitor actively for unexpected errors and anomalies such as spikes in error counts or request rates.
  • Recording hourly metrics and using them to monitor average values such as average error counts and request rates.
  • Investigating potential issues using diagnostics tools as discussed later in the section "Diagnosing storage issues."

The charts in the following image illustrate how the averaging that occurs for hourly metrics can hide spikes in activity. The hourly metrics appear to show a steady rate of requests, while the minute metrics reveal the fluctuations that are really taking place.

The remainder of this section describes what metrics you should monitor and why.

Monitoring service health

You can use the Azure portal to view the health of the Storage service (and other Azure services) in all the Azure regions around the world. Monitoring enables you to see immediately if an issue outside of your control is affecting the Storage service in the region you use for your application.

The Azure portal can also provide notifications of incidents that affect the various Azure services. Note: This information was previously available, along with historical data, on the Azure Service Dashboard.

While the Azure portal collects health information from inside the Azure datacenters (inside-out monitoring), you could also consider adopting an outside-in approach to generate synthetic transactions that periodically access your Azure-hosted web application from multiple locations. The services offered by Dynatrace are examples of this outside-in approach.

Monitoring capacity

Storage Metrics only stores capacity metrics for the Blob service because blobs typically account for the largest proportion of stored data (at the time of writing, it is not possible to use Storage Metrics to monitor the capacity of your tables and queues). You can find this data in the $MetricsCapacityBlob table if you have enabled monitoring for the Blob service. Storage Metrics records this data once per day, and you can use the value of the RowKey to determine whether the row contains an entity that relates to user data (value data) or analytics data (value analytics). Each stored entity contains information about the amount of storage used (Capacity, measured in bytes) and the current number of containers (ContainerCount) and blobs (ObjectCount) in use in the storage account. For more information about the capacity metrics stored in the $MetricsCapacityBlob table, see Storage Analytics Metrics Table Schema.
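
The analytics tables can be queried like any other table. The following is a minimal sketch, assuming the classic .NET table API (in some package versions these types live in a separate table package, so adjust the namespace accordingly); it reads the daily user-data capacity rows from $MetricsCapacityBlob:

using System;
using Microsoft.Azure.Storage;
using Microsoft.Azure.Storage.Table;   // assumption: adjust to match your package version

CloudStorageAccount account = CloudStorageAccount.Parse(connectionString);
CloudTableClient tableClient = account.CreateCloudTableClient();
CloudTable capacityTable = tableClient.GetTableReference("$MetricsCapacityBlob");

// RowKey "data" marks user data; RowKey "analytics" marks the analytics data itself.
TableQuery<DynamicTableEntity> query = new TableQuery<DynamicTableEntity>().Where(
    TableQuery.GenerateFilterCondition("RowKey", QueryComparisons.Equal, "data"));

foreach (DynamicTableEntity entity in capacityTable.ExecuteQuery(query))
{
    Console.WriteLine("{0}: Capacity={1} bytes, Containers={2}, Blobs={3}",
        entity.PartitionKey,                                    // the day the row describes
        entity.Properties["Capacity"].Int64Value,
        entity.Properties["ContainerCount"].Int64Value,
        entity.Properties["ObjectCount"].Int64Value);
}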

Note

You should monitor these values for an early warning that you are approaching the capacity limits of your storage account. In the Azure portal, you can add alert rules to notify you if aggregate storage use exceeds or falls below thresholds that you specify.

For help estimating the size of various storage objects such as blobs, see the blog post Understanding Azure Storage Billing - Bandwidth, Transactions, and Capacity.

Monitoring availability

You should monitor the availability of the storage services in your storage account by monitoring the value in the Availability column in the hourly or minute metrics tables: $MetricsHourPrimaryTransactionsBlob, $MetricsHourPrimaryTransactionsTable, $MetricsHourPrimaryTransactionsQueue, $MetricsMinutePrimaryTransactionsBlob, $MetricsMinutePrimaryTransactionsTable, $MetricsMinutePrimaryTransactionsQueue, and $MetricsCapacityBlob. The Availability column contains a percentage value that indicates the availability of the service or the API operation represented by the row (the RowKey shows whether the row contains metrics for the service as a whole or for a specific API operation).

Any value less than 100% indicates that some storage requests are failing. You can see why they are failing by examining the other columns in the metrics data that show the numbers of requests with different error types, such as ServerTimeoutError. You should expect to see Availability fall temporarily below 100% for reasons such as transient server timeouts while the service moves partitions to better load-balance requests; the retry logic in your client application should handle such intermittent conditions. The article Storage Analytics Logged Operations and Status Messages lists the transaction types that Storage Metrics includes in its Availability calculation.
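
Continuing the previous table-query sketch (and with the same caveats about library and package version), you could scan the hourly Blob transaction metrics for rows whose Availability dropped below 100%:

CloudTable metricsTable = tableClient.GetTableReference("$MetricsHourPrimaryTransactionsBlob");

// Availability is stored as a percentage; flag any hour or API operation below 100.
TableQuery<DynamicTableEntity> availabilityQuery = new TableQuery<DynamicTableEntity>().Where(
    TableQuery.GenerateFilterConditionForDouble("Availability", QueryComparisons.LessThan, 100.0));

foreach (DynamicTableEntity row in metricsTable.ExecuteQuery(availabilityQuery))
{
    Console.WriteLine("{0} / {1}: Availability = {2}%",
        row.PartitionKey,                              // the hour the row describes
        row.RowKey,                                    // "user;All" or "user;<ApiName>"
        row.Properties["Availability"].DoubleValue);
}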

Azure 门户中,可以添加警报规则,以便在某项服务的“可用性” 低于指定阈值时通知用户。In the Azure portal, you can add alert rules to notify you if Availability for a service falls below a threshold that you specify.

本指南的“故障排除指南”一节介绍与可用性相关的一些常见存储服务问题。The "Troubleshooting guidance" section of this guide describes some common storage service issues related to availability.

监视性能Monitoring performance

若要监视存储服务的性能,可以使用每小时和每分钟度量值表中的以下度量值。To monitor the performance of the storage services, you can use the following metrics from the hourly and minute metrics tables.

  • AverageE2ELatencyAverageServerLatency 列中的值显示存储服务或 API 操作类型处理请求所需的平均时间。The values in the AverageE2ELatency and AverageServerLatency columns show the average time the storage service or API operation type is taking to process requests. AverageE2ELatency 是端到端延迟的度量值,除包括处理请求所需的时间外,还包括读取请求和发送响应所需的时间(因此包括请求到达存储服务后的网络延迟);AverageServerLatency 只是处理时间的度量值,因此不包括与客户端通信相关的任何网络延迟。AverageE2ELatency is a measure of end-to-end latency that includes the time taken to read the request and send the response in addition to the time taken to process the request (therefore includes network latency once the request reaches the storage service); AverageServerLatency is a measure of just the processing time and therefore excludes any network latency related to communicating with the client. 有关这两个值之间可能存在显著区别的原因的讨论,请参阅本指南后面的“度量值显示高 AverageE2ELatency 和低 AverageServerLatency”一节。See the section "Metrics show high AverageE2ELatency and low AverageServerLatency" later in this guide for a discussion of why there might be a significant difference between these two values.
  • TotalIngressTotalEgress 列中的值显示进出存储服务或通过特定 API 操作类型的数据总量(以字节为单位)。The values in the TotalIngress and TotalEgress columns show the total amount of data, in bytes, coming in to and going out of your storage service or through a specific API operation type.
  • TotalRequests 列中的值显示存储服务的 API 操作正在接收的请求总数。The values in the TotalRequests column show the total number of requests that the storage service of API operation is receiving. TotalRequests 是存储服务收到的请求总数。TotalRequests is the total number of requests that the storage service receives.

通常,对于作为出现需要调查的问题的指示器的任何这些值,将监视其意外更改。Typically, you will monitor for unexpected changes in any of these values as an indicator that you have an issue that requires investigation.

Azure 门户中,可以添加警报规则,以便在此服务的任何性能度量值低于或超过指定阈值时通知用户。In the Azure portal, you can add alert rules to notify you if any of the performance metrics for this service fall below or exceed a threshold that you specify.

本指南的“故障排除指南”一节介绍与性能相关的一些常见存储服务问题。The "Troubleshooting guidance" section of this guide describes some common storage service issues related to performance.

Diagnosing storage issues

There are a number of ways that you might become aware of a problem or issue in your application, including:

  • A major failure that causes the application to crash or to stop working.
  • Significant changes from baseline values in the metrics you are monitoring, as described in the previous section "Monitoring your storage service."
  • Reports from users of your application that some particular operation didn't complete as expected or that some feature is not working.
  • Errors generated within your application that appear in log files or through some other notification method.

Typically, issues related to Azure storage services fall into one of four broad categories:

  • Your application has a performance issue, either reported by your users or revealed by changes in the performance metrics.
  • There is a problem with the Azure Storage infrastructure in one or more regions.
  • Your application is encountering an error, either reported by your users or revealed by an increase in one of the error count metrics you monitor.
  • During development and test, you may be using the local storage emulator; you may encounter some issues that relate specifically to usage of the storage emulator.

The following sections outline the steps you should follow to diagnose and troubleshoot issues in each of these four categories. The section "Troubleshooting guidance" later in this guide provides more detail for some common issues you may encounter.

Service health issues

Service health issues are typically outside of your control. The Azure portal provides information about any ongoing issues with Azure services, including storage services. If you opted for Read-Access Geo-Redundant Storage when you created your storage account, then if your data becomes unavailable in the primary location, your application can switch temporarily to the read-only copy in the secondary location. To read from the secondary, your application must be able to switch between using the primary and secondary storage locations, and be able to work in a reduced functionality mode with read-only data. The Azure Storage Client libraries allow you to define a retry policy that can read from secondary storage in case a read from primary storage fails. Your application also needs to be aware that the data in the secondary location is eventually consistent. For more information, see the blog post Azure Storage Redundancy Options and Read Access Geo Redundant Storage.
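
As a minimal sketch, assuming the classic .NET Blob client and an RA-GRS account, reads can be allowed to fall back to the secondary endpoint by setting the location mode on the client's default request options:

using Microsoft.Azure.Storage.Blob;
using Microsoft.Azure.Storage.RetryPolicies;

// Try the primary endpoint first; if the read fails, retry against the secondary (read-only) endpoint.
blobClient.DefaultRequestOptions = new BlobRequestOptions
{
    LocationMode = LocationMode.PrimaryThenSecondary,
    RetryPolicy = new ExponentialRetry()
};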

Performance issues

The performance of an application can be subjective, especially from a user perspective. Therefore, it is important to have baseline metrics available to help you identify where there might be a performance issue. Many factors might affect the performance of an Azure storage service from the client application perspective. These factors might operate in the storage service, in the client, or in the network infrastructure; therefore it is important to have a strategy for identifying the origin of the performance issue.

After you have identified the likely location of the cause of the performance issue from the metrics, you can then use the log files to find detailed information to diagnose and troubleshoot the problem further.

The section "Troubleshooting guidance" later in this guide provides more information about some common performance-related issues you may encounter.

Diagnosing errors

Users of your application may notify you of errors reported by the client application. Storage Metrics also records counts of different error types from your storage services, such as NetworkError, ClientTimeoutError, or AuthorizationError. While Storage Metrics only records counts of different error types, you can obtain more detail about individual requests by examining server-side, client-side, and network logs. Typically, the HTTP status code returned by the storage service will give an indication of why the request failed.
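
In a .NET client, the HTTP status code and the storage-specific error code of a failed request can be read from the exception. This is a minimal sketch, assuming the classic Storage Client Library and a blob reference already in scope:

try
{
    blob.FetchAttributes();   // any storage call can be substituted here
}
catch (StorageException ex)
{
    RequestResult info = ex.RequestInformation;
    Console.WriteLine("HTTP status: {0} {1}", info.HttpStatusCode, info.HttpStatusMessage);
    Console.WriteLine("Server request ID: {0}", info.ServiceRequestID);

    if (info.ExtendedErrorInformation != null)
    {
        // Storage error code string, for example BlobNotFound or ServerBusy.
        Console.WriteLine("Storage error code: {0}", info.ExtendedErrorInformation.ErrorCode);
    }
}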

Note

Remember that you should expect to see some intermittent errors: for example, errors due to transient network conditions, or application errors.

The following resources are useful for understanding storage-related status and error codes:

Storage emulator issues

The Azure SDK includes a storage emulator you can run on a development workstation. This emulator simulates most of the behavior of the Azure storage services and is useful during development and test, enabling you to run applications that use Azure storage services without the need for an Azure subscription and an Azure storage account.

The "Troubleshooting guidance" section of this guide describes some common issues encountered when using the storage emulator.

Storage logging tools

Storage Logging provides server-side logging of storage requests in your Azure storage account. For more information about how to enable server-side logging and access the log data, see Enabling Storage Logging and Accessing Log Data.
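
As a minimal sketch (again assuming the classic .NET Storage Client Library and reusing the blobClient created earlier), server-side logging for the Blob service can be turned on through the service properties; the log blobs then appear in the $logs container of the account:

using Microsoft.Azure.Storage.Shared.Protocol;

ServiceProperties properties = blobClient.GetServiceProperties();
properties.Logging.LoggingOperations = LoggingOperations.All;   // log read, write, and delete requests
properties.Logging.RetentionDays = 7;                           // keep log data for 7 days
properties.Logging.Version = "1.0";
blobClient.SetServiceProperties(properties);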

The Storage Client Library for .NET enables you to collect client-side log data that relates to storage operations performed by your application. For more information, see Client-side Logging with the .NET Storage Client Library.

Note

In some circumstances (such as SAS authorization failures), a user may report an error for which you can find no request data in the server-side Storage Logging logs. You can use the logging capabilities of the Storage Client Library to investigate whether the cause of the issue is on the client, or use network monitoring tools to investigate the network.

Using network logging tools

You can capture the traffic between the client and server to provide detailed information about the data the client and server are exchanging and the underlying network conditions. Useful network logging tools include:

In many cases, the log data from Storage Logging and the Storage Client Library will be sufficient to diagnose an issue, but in some scenarios, you may need the more detailed information that these network logging tools can provide. For example, using Fiddler to view HTTP and HTTPS messages enables you to view the header and payload data sent to and from the storage services, which lets you examine how a client application retries storage operations. Protocol analyzers such as Wireshark operate at the packet level, enabling you to view TCP data and to troubleshoot lost packets and connectivity issues. Message Analyzer can operate at both the HTTP and TCP layers.

End-to-end tracing

End-to-end tracing using a variety of log files is a useful technique for investigating potential issues. You can use the date/time information from your metrics data as an indication of where to start looking in the log files for the detailed information that will help you troubleshoot the issue.

Correlating log data

When viewing logs from client applications, network traces, and server-side Storage Logging, it is critical to be able to correlate requests across the different log files. The log files include a number of different fields that are useful as correlation identifiers. The client request ID is the most useful field for correlating entries in the different logs. However, sometimes it can be useful to use either the server request ID or timestamps. The following sections provide more details about these options.

Client request ID

The Storage Client Library automatically generates a unique client request ID for every request.

  • In the client-side log that the Storage Client Library creates, the client request ID appears in the Client Request ID field of every log entry relating to the request.
  • In a network trace such as one captured by Fiddler, the client request ID is visible in request messages as the x-ms-client-request-id HTTP header value.
  • In the server-side Storage Logging log, the client request ID appears in the Client request ID column.

Note

It is possible for multiple requests to share the same client request ID because the client can assign this value (although the Storage Client Library assigns a new value automatically). When the client retries, all attempts share the same client request ID. In the case of a batch sent from the client, the batch has a single client request ID.

Server request ID

The storage service automatically generates server request IDs.

  • In the server-side Storage Logging log, the server request ID appears in the Request ID header column.
  • In a network trace such as one captured by Fiddler, the server request ID appears in response messages as the x-ms-request-id HTTP header value.
  • In the client-side log that the Storage Client Library creates, the server request ID appears in the Operation Text column for the log entry showing details of the server response.

Note

The storage service always assigns a unique server request ID to every request it receives, so every retry attempt from the client and every operation included in a batch has a unique server request ID.

If the Storage Client Library throws a StorageException in the client, the RequestInformation property contains a RequestResult object that includes a ServiceRequestID property. You can also access a RequestResult object from an OperationContext instance.

The code sample below demonstrates how to set a custom ClientRequestId value by attaching an OperationContext object to the request to the storage service. It also shows how to retrieve the ServerRequestId value from the response message.

//Parse the connection string for the storage account.
const string ConnectionString = "DefaultEndpointsProtocol=https;AccountName=account-name;AccountKey=account-key;EndpointSuffix=core.chinacloudapi.cn";
CloudStorageAccount storageAccount = CloudStorageAccount.Parse(ConnectionString);
CloudBlobClient blobClient = storageAccount.CreateCloudBlobClient();

// Create an Operation Context that includes custom ClientRequestId string based on constants defined within the application along with a Guid.
OperationContext oc = new OperationContext();
oc.ClientRequestID = String.Format("{0} {1} {2} {3}", HOSTNAME, APPNAME, USERID, Guid.NewGuid().ToString());

try
{
    CloudBlobContainer container = blobClient.GetContainerReference("democontainer");
    ICloudBlob blob = container.GetBlobReferenceFromServer("testImage.jpg", null, null, oc);  
    var downloadToPath = string.Format("./{0}", blob.Name);
    using (var fs = File.OpenWrite(downloadToPath))
    {
        blob.DownloadToStream(fs, null, null, oc);
        Console.WriteLine("\t Blob downloaded to file: {0}", downloadToPath);
    }
}
catch (StorageException storageException)
{
    Console.WriteLine("Storage exception {0} occurred", storageException.Message);
    // Multiple results may exist due to client side retry logic - each retried operation will have a unique ServiceRequestId
    foreach (var result in oc.RequestResults)
    {
            Console.WriteLine("HttpStatus: {0}, ServiceRequestId {1}", result.HttpStatusCode, result.ServiceRequestID);
    }
}

Timestamps

You can also use timestamps to locate related log entries, but be careful of any clock skew between the client and server that may exist. Search plus or minus 15 minutes for matching server-side entries based on the timestamp on the client. Remember that the blob metadata for the blobs containing metrics indicates the time range for the metrics stored in the blob. This time range is useful if you have many metrics blobs for the same minute or hour.

Troubleshooting guidance

This section will help you diagnose and troubleshoot some of the common issues your application may encounter when using the Azure storage services. Use the list below to locate the information relevant to your specific issue.

Troubleshooting Decision Tree


  • Does your issue relate to the performance of one of the storage services?
  • Does your issue relate to the availability of one of the storage services?
  • Is your client application receiving an HTTP 4XX (such as 404) response from a storage service?
  • Metrics show low PercentSuccess or analytics log entries have operations with transaction status of ClientOtherErrors
  • Capacity metrics show an unexpected increase in storage capacity usage
  • You are experiencing unexpected reboots of Virtual Machines that have a large number of attached VHDs
  • Your issue arises from using the storage emulator for development or test
  • You are encountering problems installing the Azure SDK for .NET
  • You have a different issue with a storage service


Metrics show high AverageE2ELatency and low AverageServerLatency

The illustration below from the Azure portal monitoring tool shows an example where the AverageE2ELatency is significantly higher than the AverageServerLatency.

The storage service only calculates the metric AverageE2ELatency for successful requests and, unlike AverageServerLatency, it includes the time the client takes to send the data and receive acknowledgement from the storage service. Therefore, a difference between AverageE2ELatency and AverageServerLatency could be due either to the client application being slow to respond or to conditions on the network.

Note

You can also view E2ELatency and ServerLatency for individual storage operations in the Storage Logging log data.

Investigating client performance issues

Possible reasons for the client responding slowly include having a limited number of available connections or threads, or being low on resources such as CPU, memory, or network bandwidth. You may be able to resolve the issue by modifying the client code to be more efficient (for example, by using asynchronous calls to the storage service) or by using a larger Virtual Machine (with more cores and more memory).

For the table and queue services, the Nagle algorithm can also cause high AverageE2ELatency as compared to AverageServerLatency: for more information, see the post Nagle's Algorithm is Not Friendly towards Small Requests. You can disable the Nagle algorithm in code by using the ServicePointManager class in the System.Net namespace. You should do this before you make any calls to the table or queue services in your application, since doing so does not affect connections that are already open. The following example comes from the Application_Start method in a worker role.

// Disable the Nagle algorithm on the table and queue endpoints before any connections
// are opened; connections that are already open are not affected.
var storageAccount = CloudStorageAccount.Parse(connStr);

ServicePoint tableServicePoint = ServicePointManager.FindServicePoint(storageAccount.TableEndpoint);
tableServicePoint.UseNagleAlgorithm = false;

ServicePoint queueServicePoint = ServicePointManager.FindServicePoint(storageAccount.QueueEndpoint);
queueServicePoint.UseNagleAlgorithm = false;

You should check the client-side logs to see how many requests your client application is submitting, and check for general .NET-related performance bottlenecks in your client such as CPU, .NET garbage collection, network utilization, or memory. As a starting point for troubleshooting .NET client applications, see Debugging, Tracing, and Profiling.

Investigating network latency issues

Typically, high end-to-end latency caused by the network is due to transient conditions. You can investigate both transient and persistent network issues, such as dropped packets, by using tools such as Wireshark or Microsoft Message Analyzer.

For more information about using Wireshark to troubleshoot network issues, see "Appendix 2: Using Wireshark to capture network traffic."

For more information about using Microsoft Message Analyzer to troubleshoot network issues, see "Appendix 3: Using Microsoft Message Analyzer to capture network traffic."

Metrics show low AverageE2ELatency and low AverageServerLatency but the client is experiencing high latency

In this scenario, the most likely cause is a delay in the storage requests reaching the storage service. You should investigate why requests from the client are not making it through to the blob service.

One possible reason for the client delaying sending requests is that there are a limited number of available connections or threads.

Also check whether the client is performing multiple retries, and investigate the reason if it is. To determine whether the client is performing multiple retries, you can:

  • Examine the Storage Analytics logs. If multiple retries are happening, you will see multiple operations with the same client request ID but with different server request IDs.
  • Examine the client-side logs. Verbose logging will indicate that a retry has occurred.
  • Debug your code, and check the properties of the OperationContext object associated with the request. If the operation has been retried, the RequestResults property will include multiple unique server request IDs. You can also check the start and end times for each request. For more information, see the code sample in the section "Server request ID."

If there are no issues in the client, you should investigate potential network issues such as packet loss. You can use tools such as Wireshark or Microsoft Message Analyzer to investigate network issues.

For more information about using Wireshark to troubleshoot network issues, see "Appendix 2: Using Wireshark to capture network traffic."

For more information about using Microsoft Message Analyzer to troubleshoot network issues, see "Appendix 3: Using Microsoft Message Analyzer to capture network traffic."

Metrics show high AverageServerLatency

In the case of high AverageServerLatency for blob download requests, you should use the Storage Logging logs to see if there are repeated requests for the same blob (or set of blobs). For blob upload requests, you should investigate what block size the client is using (for example, blocks less than 64 K in size can result in overheads unless the reads are also in less-than-64 K chunks), and whether multiple clients are uploading blocks to the same blob in parallel. You should also check the per-minute metrics for spikes in the number of requests that result in exceeding the per-second scalability targets: see also "Metrics show an increase in PercentTimeoutError."

If you are seeing high AverageServerLatency for blob download requests when there are repeated requests for the same blob or set of blobs, you should consider caching these blobs using Azure Cache or the Azure Content Delivery Network (CDN). For upload requests, you can improve the throughput by using a larger block size. For queries to tables, it is also possible to implement client-side caching on clients that perform the same query operations and where the data doesn't change frequently.
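
As a rough sketch with the classic .NET Blob client (the blob name and the 4 MB / 8 MB figures below are illustrative assumptions, not recommendations from this guide), the block size used for uploads can be increased like this:

using System.IO;

CloudBlobContainer container = blobClient.GetContainerReference("democontainer");
CloudBlockBlob blockBlob = container.GetBlockBlobReference("large-file.dat");

// Upload in 4 MB blocks instead of the library default to reduce per-block overhead.
blockBlob.StreamWriteSizeInBytes = 4 * 1024 * 1024;

// Uploads larger than this threshold are split into blocks rather than sent as a single PUT.
blobClient.DefaultRequestOptions.SingleBlobUploadThresholdInBytes = 8 * 1024 * 1024;

using (FileStream fileStream = File.OpenRead("large-file.dat"))
{
    blockBlob.UploadFromStream(fileStream);
}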

AverageServerLatency 值也可能是设计欠佳的表或查询的症状,它会导致扫描操作或执行追加/前面预置反模式。High AverageServerLatency values can also be a symptom of poorly designed tables or queries that result in scan operations or that follow the append/prepend anti-pattern. 有关详细信息,请参阅“度量值显示 PercentThrottlingError 增加”。For more information, see "Metrics show an increase in PercentThrottlingError".

Note

You can find a comprehensive performance checklist here: Azure Storage Performance and Scalability Checklist.

You are experiencing unexpected delays in message delivery on a queue

If you are experiencing a delay between the time an application adds a message to a queue and the time it becomes available to read from the queue, take the following steps to diagnose the issue:

  • Verify the application is successfully adding the messages to the queue. Check that the application is not retrying the AddMessage method several times before succeeding. The Storage Client Library logs will show any repeated retries of storage operations.
  • Verify there is no clock skew between the worker role that adds the message to the queue and the worker role that reads the message from the queue that makes it appear as if there is a delay in processing.
  • Check whether the worker role that reads the messages from the queue is failing. If a queue client calls the GetMessage method but fails to respond with an acknowledgement, the message will remain invisible on the queue until the invisibilityTimeout period expires. At that point, the message becomes available for processing again. (A minimal consume-and-delete sketch follows this list.)
  • Check whether the queue length is growing over time. This can occur if you do not have sufficient workers available to process all of the messages that other workers are placing on the queue. Also check the metrics to see if delete requests are failing, and check the dequeue count on messages, which might indicate repeated failed attempts to delete the message.
  • Examine the Storage Logging logs for any queue operations that have higher than expected E2ELatency and ServerLatency values over a longer period of time than usual.
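
The following is a minimal consume-and-delete sketch, assuming the classic .NET queue client and reusing the storage account object from earlier; the queue name and the two-minute visibility timeout are hypothetical values:

using Microsoft.Azure.Storage.Queue;

CloudQueueClient queueClient = account.CreateCloudQueueClient();
CloudQueue queue = queueClient.GetQueueReference("orders");   // hypothetical queue name

// Hide the message from other consumers for 2 minutes while it is processed.
CloudQueueMessage message = queue.GetMessage(TimeSpan.FromMinutes(2));
if (message != null)
{
    Console.WriteLine("Dequeue count so far: {0}", message.DequeueCount);
    Console.WriteLine("Processing message: {0}", message.AsString);

    // Delete the message once processing succeeds; if this call is skipped or fails,
    // the message reappears on the queue after the visibility timeout expires.
    queue.DeleteMessage(message);
}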

Metrics show an increase in PercentThrottlingError

Throttling errors occur when you exceed the scalability targets of a storage service. The storage service throttles to ensure that no single client or tenant can use the service at the expense of others. For more information, see Azure Storage Scalability and Performance Targets for details on the scalability targets for storage accounts and the performance targets for partitions within storage accounts.

If the PercentThrottlingError metric shows an increase in the percentage of requests that are failing with a throttling error, you need to investigate one of two scenarios:

An increase in PercentThrottlingError often occurs at the same time as an increase in the number of storage requests, or when you are initially load testing your application. This may also manifest itself in the client as "503 Server Busy" or "500 Operation Timeout" HTTP status messages from storage operations.

Transient increase in PercentThrottlingError

If you are seeing spikes in the value of PercentThrottlingError that coincide with periods of high activity for the application, you should implement an exponential (not linear) back-off strategy for retries in your client. Back-off retries reduce the immediate load on the partition and help your application to smooth out spikes in traffic. For more information about how to implement retry policies using the Storage Client Library, see the Microsoft.Azure.Storage.RetryPolicies namespace.
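
A minimal sketch, assuming the classic .NET Storage Client Library and reusing the blobClient from earlier; the 2-second delta back-off and the limit of 5 attempts are illustrative values:

using Microsoft.Azure.Storage.Blob;
using Microsoft.Azure.Storage.RetryPolicies;

// Apply an exponential back-off retry policy to every request made through this client.
blobClient.DefaultRequestOptions = new BlobRequestOptions
{
    RetryPolicy = new ExponentialRetry(TimeSpan.FromSeconds(2), 5)   // delta back-off, max attempts
};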

Note

You may also see spikes in the value of PercentThrottlingError that do not coincide with periods of high activity for the application: the most likely cause here is the storage service moving partitions to improve load balancing.

Permanent increase in PercentThrottlingError

If you are seeing a consistently high value for PercentThrottlingError following a permanent increase in your transaction volumes, or when you are performing your initial load tests on your application, then you need to evaluate how your application is using storage partitions and whether it is approaching the scalability targets for a storage account. For example, if you are seeing throttling errors on a queue (which counts as a single partition), then you should consider using additional queues to spread the transactions across multiple partitions. If you are seeing throttling errors on a table, you need to consider using a different partitioning scheme to spread your transactions across multiple partitions by using a wider range of partition key values. One common cause of this issue is the prepend/append anti-pattern, where you select the date as the partition key so that all data for a particular day is written to one partition: under load, this can result in a write bottleneck. Either consider a different partitioning design or evaluate whether using blob storage might be a better solution. Also check whether throttling is occurring as a result of spikes in your traffic and investigate ways of smoothing your pattern of requests.

If you distribute your transactions across multiple partitions, you must still be aware of the scalability limits set for the storage account. For example, if you used ten queues each processing the maximum of 2,000 1KB messages per second, you will be at the overall limit of 20,000 messages per second for the storage account. If you need to process more than 20,000 entities per second, you should consider using multiple storage accounts. You should also bear in mind that the size of your requests and entities has an impact on when the storage service throttles your clients: if you have larger requests and entities, you may be throttled sooner.

Inefficient query design can also cause you to hit the scalability limits for table partitions. For example, a query with a filter that selects only one percent of the entities in a partition but that scans all the entities in the partition will need to access each entity. Every entity read counts towards the total number of transactions in that partition; therefore, you can easily reach the scalability targets.
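
To illustrate the difference, here is a rough sketch using the classic .NET table API (the table name and key values are hypothetical): the first query is a point lookup that touches a single entity, while the second applies a non-key filter and therefore reads every entity in the partition, each of which counts as a transaction:

CloudTable customers = tableClient.GetTableReference("customers");

// Point query: PartitionKey and RowKey identify exactly one entity.
TableOperation pointQuery = TableOperation.Retrieve<DynamicTableEntity>("Smith", "Ben");
TableResult single = customers.Execute(pointQuery);

// Partition scan: the PartitionKey filter narrows the query to one partition, but the Email
// filter is not part of the key, so every entity in the partition is read and counted.
TableQuery<DynamicTableEntity> partitionScan = new TableQuery<DynamicTableEntity>().Where(
    TableQuery.CombineFilters(
        TableQuery.GenerateFilterCondition("PartitionKey", QueryComparisons.Equal, "Smith"),
        TableOperators.And,
        TableQuery.GenerateFilterCondition("Email", QueryComparisons.Equal, "ben@contoso.com")));
var matches = customers.ExecuteQuery(partitionScan);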

Note

Your performance testing should reveal any inefficient query designs in your application.

Metrics show an increase in PercentTimeoutError

Your metrics show an increase in PercentTimeoutError for one of your storage services. At the same time, the client receives a high volume of "500 Operation Timeout" HTTP status messages from storage operations.

Note

You may see timeout errors temporarily as the storage service load balances requests by moving a partition to a new server.

The PercentTimeoutError metric is an aggregation of the following metrics: ClientTimeoutError, AnonymousClientTimeoutError, SASClientTimeoutError, ServerTimeoutError, AnonymousServerTimeoutError, and SASServerTimeoutError.

The server timeouts are caused by an error on the server. The client timeouts happen because an operation on the server has exceeded the timeout specified by the client; for example, a client using the Storage Client Library can set a timeout for an operation by using the ServerTimeout property of the QueueRequestOptions class.

Server timeouts indicate a problem with the storage service that requires further investigation. You can use metrics to see whether you are hitting the scalability limits for the service and to identify any spikes in traffic that might be causing this problem. If the problem is intermittent, it may be due to load-balancing activity in the service. If the problem is persistent and is not caused by your application hitting the scalability limits of the service, you should raise a support issue. For client timeouts, you must decide whether the timeout is set to an appropriate value in the client, and then either change the timeout value set in the client or investigate how you can improve the performance of the operations in the storage service, for example by optimizing your table queries or reducing the size of your messages.
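
A minimal sketch of setting client-controlled timeouts on a queue request, assuming the classic .NET queue client and reusing the queue reference from the earlier sketch; the 5-second and 30-second values are illustrative:

QueueRequestOptions timeoutOptions = new QueueRequestOptions
{
    ServerTimeout = TimeSpan.FromSeconds(5),          // timeout for a single request on the server
    MaximumExecutionTime = TimeSpan.FromSeconds(30)   // overall limit across all retries
};

CloudQueueMessage next = queue.GetMessage(TimeSpan.FromMinutes(2), timeoutOptions, operationContext: null);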

Metrics show an increase in PercentNetworkError

Your metrics show an increase in PercentNetworkError for one of your storage services. The PercentNetworkError metric is an aggregation of the following metrics: NetworkError, AnonymousNetworkError, and SASNetworkError. These occur when the storage service detects a network error when the client makes a storage request.

The most common cause of this error is a client disconnecting before a timeout expires in the storage service. Investigate the code in your client to understand why and when the client disconnects from the storage service. You can also use Wireshark, Microsoft Message Analyzer, or Tcping to investigate network connectivity issues from the client. These tools are described in the Appendices.

The client is receiving HTTP 403 (Forbidden) messages

If your client application is throwing HTTP 403 (Forbidden) errors, a likely cause is that the client is using an expired Shared Access Signature (SAS) when it sends a storage request (although other possible causes include clock skew, invalid keys, and empty headers). If an expired SAS key is the cause, you will not see any entries in the server-side Storage Logging log data. The following table shows a sample from the client-side log generated by the Storage Client Library that illustrates this issue occurring:

Source | Verbosity | Verbosity level | Client request ID | Operation text
Microsoft.Azure.Storage | Information | 3 | 85d077ab-… | Starting operation with location Primary per location mode PrimaryOnly.
Microsoft.Azure.Storage | Information | 3 | 85d077ab-… | Starting synchronous request to https://domemaildist.blob.core.chinacloudapi.cnazureimblobcontainer/blobCreatedViaSAS.txt?sv=2014-02-14&sr=c&si=mypolicy&sig=OFnd4Rd7z01fIvh%2BmcR6zbudIH2F5Ikm%2FyhNYZEmJNQ%3D&api-version=2014-02-14
Microsoft.Azure.Storage | Information | 3 | 85d077ab-… | Waiting for response.
Microsoft.Azure.Storage | Warning | 2 | 85d077ab-… | Exception thrown while waiting for response: The remote server returned an error: (403) Forbidden.
Microsoft.Azure.Storage | Information | 3 | 85d077ab-… | Response received. Status code = 403, Request ID = 9d67c64a-64ed-4b0d-9515-3b14bbcdc63d, Content-MD5 = , ETag = .
Microsoft.Azure.Storage | Warning | 2 | 85d077ab-… | Exception thrown during the operation: The remote server returned an error: (403) Forbidden.
Microsoft.Azure.Storage | Information | 3 | 85d077ab-… | Checking if the operation should be retried. Retry count = 0, HTTP status code = 403, Exception = The remote server returned an error: (403) Forbidden.
Microsoft.Azure.Storage | Information | 3 | 85d077ab-… | The next location has been set to Primary, based on the location mode.
Microsoft.Azure.Storage | Error | 1 | 85d077ab-… | Retry policy did not allow for a retry. Failing with The remote server returned an error: (403) Forbidden.

在此方案中,应调查在客户端将该令牌发送到服务器之前 SAS 令牌即将到期的原因:In this scenario, you should investigate why the SAS token is expiring before the client sends the token to the server:

  • 通常,创建 SAS 供客户端立即使用时,不应设置开始时间。Typically, you should not set a start time when you create a SAS for a client to use immediately. 如果使用当前时间生成 SAS 的主机与存储服务之间存在较小的时钟差异,则存储服务有可能收到尚未生效的 SAS。If there are small clock differences between the host generating the SAS using the current time and the storage service, then it is possible for the storage service to receive a SAS that is not yet valid.
  • 不要在 SAS 上设置太短的到期时间。Do not set a very short expiry time on a SAS. 同样,生成 SAS 的主机与存储服务之间的较小时钟差异可能会导致 SAS 似乎早于预期到期。Again, small clock differences between the host generating the SAS and the storage service can lead to a SAS apparently expiring earlier than anticipated.
  • SAS 密钥中的版本参数(例如 sv=2015-04-05)是否与正在使用的存储客户端库的版本匹配?Does the version parameter in the SAS key (for example sv=2015-04-05) match the version of the Storage Client Library you are using? 建议始终使用最新版的存储客户端库We recommend that you always use the latest version of the Storage Client Library.
  • 如果重新生成存储访问密钥,可能会使任何现有的 SAS 令牌无效。If you regenerate your storage access keys, any existing SAS tokens may be invalidated. 如果生成的 SAS 令牌具有较长的到期时间供客户端应用程序缓存,可能会出现此问题。This issue may arise if you generate SAS tokens with a long expiry time for client applications to cache.

If you are using the Storage Client Library to generate SAS tokens, then it is easy to build a valid token, as the sketch below illustrates. However, if you are using the Storage REST API and constructing the SAS tokens by hand, see Delegating Access with a Shared Access Signature.
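
For illustration only, the following minimal sketch (assuming the .NET Storage Client Library and an existing CloudBlobContainer object named container, which is a placeholder) generates an ad-hoc SAS with no start time and a 30-minute expiry, which sidesteps the clock-skew issues described above:

// Hypothetical sketch: generate an ad-hoc SAS for immediate use.
// Omitting SharedAccessStartTime avoids clock-skew problems between the
// machine generating the SAS and the storage service.
SharedAccessBlobPolicy policy = new SharedAccessBlobPolicy
{
    // No start time: the SAS is valid as soon as it is issued.
    SharedAccessExpiryTime = DateTimeOffset.UtcNow.AddMinutes(30),
    Permissions = SharedAccessBlobPermissions.Read | SharedAccessBlobPermissions.Write
};

// container is an existing CloudBlobContainer instance (placeholder).
string sasToken = container.GetSharedAccessSignature(policy);
string containerSasUri = container.Uri + sasToken;

The 30-minute window is only an example; choose an expiry that comfortably covers the client operation without leaving the token valid far longer than necessary.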

The client is receiving HTTP 404 (Not found) messages

If the client application receives an HTTP 404 (Not found) message from the server, this implies that the object the client was attempting to use (such as an entity, table, blob, container, or queue) does not exist in the storage service. There are a number of possible reasons for this, such as:

The client or another process previously deleted the object

In scenarios where the client is attempting to read, update, or delete data in a storage service, it is usually easy to identify in the server-side logs a previous operation that deleted the object in question from the storage service. Often, the log data shows that another user or process deleted the object. In the server-side Storage Logging log, the operation-type and requested-object-key columns show when a client deleted an object.

In the scenario where a client is attempting to insert an object, it may not be immediately obvious why this results in an HTTP 404 (Not found) response, given that the client is creating a new object. However, if the client is creating a blob it must be able to find the blob container, if the client is creating a message it must be able to find a queue, and if the client is adding a row it must be able to find the table.

You can use the client-side log from the Storage Client Library to gain a more detailed understanding of when the client sends specific requests to the storage service.

The following client-side log generated by the Storage Client Library illustrates the problem when the client cannot find the container for the blob it is creating. This log includes details of the following storage operations:

Request ID Operation
07b26a5d-... DeleteIfExists method to delete the blob container. Note that this operation includes a HEAD request to check for the existence of the container.
e2d06d78-... CreateIfNotExists method to create the blob container. Note that this operation includes a HEAD request that checks for the existence of the container. The HEAD returns a 404 message but continues.
de8b1c3c-... UploadFromStream method to create the blob. The PUT request fails with a 404 message.

Log entries:

Request ID Operation Text
07b26a5d-... Starting synchronous request to https://domemaildist.blob.core.chinacloudapi.cn/azuremmblobcontainer.
07b26a5d-... StringToSign = HEAD............x-ms-client-request-id:07b26a5d-....x-ms-date:Tue, 03 Jun 2014 10:33:11 GMT.x-ms-version:2014-02-14./domemaildist/azuremmblobcontainer.restype:container.
07b26a5d-... Waiting for response.
07b26a5d-... Response received. Status code = 200, Request ID = eeead849-..., Content-MD5 = , ETag = "0x8D14D2DC63D059B".
07b26a5d-... Response headers were processed successfully, proceeding with the rest of the operation.
07b26a5d-... Downloading response body.
07b26a5d-... Operation completed successfully.
07b26a5d-... Starting synchronous request to https://domemaildist.blob.core.chinacloudapi.cn/azuremmblobcontainer.
07b26a5d-... StringToSign = DELETE............x-ms-client-request-id:07b26a5d-....x-ms-date:Tue, 03 Jun 2014 10:33:12 GMT.x-ms-version:2014-02-14./domemaildist/azuremmblobcontainer.restype:container.
07b26a5d-... Waiting for response.
07b26a5d-... Response received. Status code = 202, Request ID = 6ab2a4cf-..., Content-MD5 = , ETag = .
07b26a5d-... Response headers were processed successfully, proceeding with the rest of the operation.
07b26a5d-... Downloading response body.
07b26a5d-... Operation completed successfully.
e2d06d78-... Starting asynchronous request to https://domemaildist.blob.core.chinacloudapi.cn/azuremmblobcontainer.
e2d06d78-... StringToSign = HEAD............x-ms-client-request-id:e2d06d78-....x-ms-date:Tue, 03 Jun 2014 10:33:12 GMT.x-ms-version:2014-02-14./domemaildist/azuremmblobcontainer.restype:container.
e2d06d78-... Waiting for response.
de8b1c3c-... Starting synchronous request to https://domemaildist.blob.core.chinacloudapi.cn/azuremmblobcontainer/blobCreated.txt.
de8b1c3c-... StringToSign = PUT...64.qCmF+TQLPhq/YYK50mP9ZQ==........x-ms-blob-type:BlockBlob.x-ms-client-request-id:de8b1c3c-....x-ms-date:Tue, 03 Jun 2014 10:33:12 GMT.x-ms-version:2014-02-14./domemaildist/azuremmblobcontainer/blobCreated.txt.
de8b1c3c-... Preparing to write request data.
e2d06d78-... Exception thrown while waiting for response: The remote server returned an error: (404) Not Found.
e2d06d78-... Response received. Status code = 404, Request ID = 353ae3bc-..., Content-MD5 = , ETag = .
e2d06d78-... Response headers were processed successfully, proceeding with the rest of the operation.
e2d06d78-... Downloading response body.
e2d06d78-... Operation completed successfully.
e2d06d78-... Starting asynchronous request to https://domemaildist.blob.core.chinacloudapi.cn/azuremmblobcontainer.
e2d06d78-... StringToSign = PUT...0.........x-ms-client-request-id:e2d06d78-....x-ms-date:Tue, 03 Jun 2014 10:33:12 GMT.x-ms-version:2014-02-14./domemaildist/azuremmblobcontainer.restype:container.
e2d06d78-... Waiting for response.
de8b1c3c-... Writing request data.
de8b1c3c-... Waiting for response.
e2d06d78-... Exception thrown while waiting for response: The remote server returned an error: (409) Conflict.
e2d06d78-... Response received. Status code = 409, Request ID = c27da20e-..., Content-MD5 = , ETag = .
e2d06d78-... Downloading error response body.
de8b1c3c-... Exception thrown while waiting for response: The remote server returned an error: (404) Not Found.
de8b1c3c-... Response received. Status code = 404, Request ID = 0eaeab3e-..., Content-MD5 = , ETag = .
de8b1c3c-... Exception thrown during the operation: The remote server returned an error: (404) Not Found.
de8b1c3c-... Retry policy did not allow for a retry. Failing with The remote server returned an error: (404) Not Found.
e2d06d78-... Retry policy did not allow for a retry. Failing with The remote server returned an error: (409) Conflict.

In this example, the log shows that the client is interleaving the requests from the CreateIfNotExists method (request ID e2d06d78-...) with the requests from the UploadFromStream method (de8b1c3c-...). This interleaving happens because the client application is invoking these methods asynchronously. Modify the asynchronous code in the client to ensure that it creates the container before attempting to upload any data to a blob in that container; a sketch follows. Ideally, you should create all your containers in advance.
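
For example, a minimal sketch (assuming a CloudBlobContainer named container and a readable Stream named sourceStream, both placeholders) that awaits the container creation before starting the upload, so the two operations can no longer interleave:

// Hypothetical sketch: await container creation before uploading, so the
// PUT for the blob cannot be sent before the container exists.
await container.CreateIfNotExistsAsync();

CloudBlockBlob blob = container.GetBlockBlobReference("blobCreated.txt");
await blob.UploadFromStreamAsync(sourceStream);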

A Shared Access Signature (SAS) authorization issue

If the client application attempts to use a SAS key that does not include the necessary permissions for the operation, the storage service returns an HTTP 404 (Not found) message to the client. At the same time, you will also see a non-zero value for SASAuthorizationError in the metrics.

The following table shows a sample server-side log message from the Storage Logging log file:

Name Value
Request start time 2014-05-30T06:17:48.4473697Z
Operation type GetBlobProperties
Request status SASAuthorizationError
HTTP status code 404
Authentication type Sas
Service type Blob
Request URL https://domemaildist.blob.core.chinacloudapi.cn/azureimblobcontainer/blobCreatedViaSAS.txt?sv=2014-02-14&sr=c&si=mypolicy&sig=XXXXX&api-version=2014-02-14
Request ID header a1f348d5-8032-4912-93ef-b393e5252a3b
Client request ID 2d064953-8436-4ee0-aa0c-65cb874f7929

Investigate why your client application is attempting to perform an operation for which it has not been granted permissions.

Client-side JavaScript code does not have permission to access the object

If you are using a JavaScript client and the storage service is returning HTTP 404 messages, check for the following JavaScript errors in the browser:

SEC7120: Origin http://localhost:56309 not found in Access-Control-Allow-Origin header.
SCRIPT7002: XMLHttpRequest: Network Error 0x80070005, Access is denied.

Note

You can use the F12 developer tools in Internet Explorer to trace the messages exchanged between the browser and the storage service when you are troubleshooting client-side JavaScript issues.

These errors occur because the web browser implements the same-origin policy security restriction, which prevents a web page from calling an API in a domain different from the domain the page comes from.

To work around the JavaScript issue, you can configure Cross-Origin Resource Sharing (CORS) for the storage service the client is accessing. For more information, see Cross-Origin Resource Sharing (CORS) Support for Azure Storage Services.

The following code sample shows how to configure your blob service to allow JavaScript running in the Contoso domain to access a blob in your blob storage service:

// Create a blob service client using the account name and key.
CloudBlobClient client = new CloudBlobClient(blobEndpoint, new StorageCredentials(accountName, accountKey));

// Fetch the current service properties so other settings are preserved.
ServiceProperties sp = client.GetServiceProperties();
sp.DefaultServiceVersion = "2013-08-15";

// Define a CORS rule that allows GET and PUT requests from the Contoso domain.
CorsRule cr = new CorsRule();
cr.AllowedHeaders.Add("*");
cr.AllowedMethods = CorsHttpMethods.Get | CorsHttpMethods.Put;
cr.AllowedOrigins.Add("http://www.contoso.com");
cr.ExposedHeaders.Add("x-ms-*");
cr.MaxAgeInSeconds = 5;

// Replace any existing CORS rules with this one and save the service properties.
sp.Cors.CorsRules.Clear();
sp.Cors.CorsRules.Add(cr);
client.SetServiceProperties(sp);

Network failure

In some circumstances, lost network packets can lead to the storage service returning HTTP 404 messages to the client. For example, when your client application is deleting an entity from the table service, you see the client throw a storage exception reporting an "HTTP 404 (Not Found)" status message from the table service. When you investigate the table in the table storage service, you see that the service did delete the entity as requested.

The exception details in the client include the request ID (7e84f12d...) assigned by the table service for the request: you can use this information to locate the request details in the server-side storage logs by searching in the request-id-header column in the log file. You could also use the metrics to identify when failures such as this occur, and then search the log files based on the time the metrics recorded the error. This log entry shows that the delete failed with an "HTTP (404) Client Other Error" status message. The same log entry also includes the request ID generated by the client in the client-request-id column (813ea74f...).

The server-side log also includes another entry with the same client-request-id value (813ea74f...) for a successful delete operation for the same entity, from the same client. This successful delete operation took place very shortly before the failed delete request.

The most likely cause of this scenario is that the client sent a delete request for the entity to the table service, which succeeded, but the client did not receive an acknowledgement from the server (perhaps because of a temporary network issue). The client then automatically retried the operation (using the same client-request-id), and this retry failed because the entity had already been deleted.

If this problem occurs frequently, you should investigate why the client is failing to receive acknowledgements from the table service. If the problem is intermittent, you should trap the "HTTP (404) Not Found" error and log it in the client, but allow the client to continue; one way to do this is sketched below.
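
The following sketch (assuming the .NET Storage Client Library, with table and entity as placeholders for an existing CloudTable and a previously retrieved table entity, and Trace being System.Diagnostics.Trace) treats a 404 on a delete as "already deleted" and lets the client continue:

// Hypothetical sketch: tolerate a 404 returned by a retried delete.
try
{
    await table.ExecuteAsync(TableOperation.Delete(entity));
}
catch (StorageException ex) when (ex.RequestInformation.HttpStatusCode == 404)
{
    // The entity is already gone, for example because an earlier attempt
    // succeeded but its response was lost on the network. Log and continue.
    Trace.TraceWarning("Delete returned 404; treating the entity as already deleted.");
}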

The client is receiving HTTP 409 (Conflict) messages

The following table shows an extract from the server-side log for two client operations: DeleteIfExists followed immediately by CreateIfNotExists using the same blob container name. Each client operation results in two requests sent to the server, first a GetContainerProperties request to check if the container exists, followed by the DeleteContainer or CreateContainer request.

Timestamp Operation Result Container name Client request ID
05:10:13.7167225 GetContainerProperties 200 mmcont c9f52c89-...
05:10:13.8167325 DeleteContainer 202 mmcont c9f52c89-...
05:10:13.8987407 GetContainerProperties 404 mmcont bc881924-...
05:10:14.2147723 CreateContainer 409 mmcont bc881924-...

The code in the client application deletes and then immediately recreates a blob container using the same name: the CreateIfNotExists method (client request ID bc881924-...) eventually fails with the HTTP 409 (Conflict) error. When a client deletes blob containers, tables, or queues, there is a brief period before the name becomes available again.

If the delete/recreate pattern is common, the client application should use a unique container name whenever it creates a new container, for example as in the sketch below.
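
A minimal sketch (the naming scheme and the blobClient variable are placeholders) that appends a unique suffix, so a freshly created container never collides with a name that is still being deleted:

// Hypothetical sketch: build a unique, valid container name
// (lowercase letters, digits, and hyphens; 3-63 characters long).
string containerName = "mmcont-" + Guid.NewGuid().ToString("N");
CloudBlobContainer container = blobClient.GetContainerReference(containerName);
await container.CreateAsync();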

Metrics show low PercentSuccess or analytics log entries have operations with transaction status of ClientOtherErrors

The PercentSuccess metric captures the percentage of operations that were successful, based on their HTTP status code. Operations with status codes of 2XX count as successful, whereas operations with status codes in the 3XX, 4XX, and 5XX ranges count as unsuccessful and lower the PercentSuccess metric value. In the server-side storage log files, these operations are recorded with a transaction status of ClientOtherErrors.

It is important to note that these operations have completed successfully and therefore do not affect other metrics such as availability. Some examples of operations that execute successfully but that can result in unsuccessful HTTP status codes include:

  • ResourceNotFound (Not Found 404), for example from a GET request to a blob that does not exist.
  • ResourceAlreadyExists (Conflict 409), for example from a CreateIfNotExist operation where the resource already exists.
  • ConditionNotMet (Not Modified 304), for example from a conditional operation such as when a client sends an ETag value and an HTTP If-None-Match header to request an image only if it has been updated since the last operation.

You can find a list of common REST API error codes that the storage services return on the page Common REST API Error Codes. The conditional download sketched below shows how the third case above can arise in perfectly normal client code.
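
For illustration only, the following sketch (blob, localStream, and cachedETag are placeholders for an existing CloudBlockBlob, a writable Stream, and an ETag saved from an earlier download) performs a conditional download that deliberately produces a 304 when the blob has not changed:

// Hypothetical sketch: conditional download that yields 304 (Not Modified)
// when the cached copy is still current.
try
{
    await blob.DownloadToStreamAsync(
        localStream,
        AccessCondition.GenerateIfNoneMatchCondition(cachedETag),
        null,   // BlobRequestOptions
        null);  // OperationContext
}
catch (StorageException ex) when (ex.RequestInformation.HttpStatusCode == 304)
{
    // Expected when the blob is unchanged: keep using the cached copy.
    // The 304 still counts against PercentSuccess and is logged as
    // ClientOtherErrors, even though the application behaved correctly.
}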

Capacity metrics show an unexpected increase in storage capacity usage

If you see sudden, unexpected changes in capacity usage in your storage account, you can investigate the cause by first looking at your availability metrics. For example, an increase in the number of failed delete requests might lead to an increase in the amount of blob storage you are using, because the application-specific cleanup operations that you expected to be freeing up space may not be working as expected (for example, because the SAS tokens they use have expired).

Your issue arises from using the storage emulator for development or test

You typically use the storage emulator during development and test to avoid the need for an Azure storage account. The common issues that can occur when you are using the storage emulator are:

Feature "X" is not working in the storage emulator

The storage emulator does not support all of the features of the Azure storage services, such as the File service. For more information, see Use the Azure Storage Emulator for Development and Testing.

For those features that the storage emulator does not support, use the Azure storage service in the cloud.

Error "The value for one of the HTTP headers is not in the correct format" when using the storage emulator

You are testing an application that uses the Storage Client Library against the local storage emulator, and method calls such as CreateIfNotExists fail with the error message "The value for one of the HTTP headers is not in the correct format." This indicates that the version of the storage emulator you are using does not support the version of the Storage Client Library you are using. The Storage Client Library adds the header x-ms-version to all the requests it makes. If the storage emulator does not recognize the value in the x-ms-version header, it rejects the request.

You can use the Storage Client Library client-side logs to see the value of the x-ms-version header it is sending. You can also see the value of the x-ms-version header if you use Fiddler to trace the requests from your client application.

This scenario typically occurs if you install and use the latest version of the Storage Client Library without updating the storage emulator. You should either install the latest version of the storage emulator, or use cloud storage instead of the emulator for development and test.
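
If it is not obvious which library version your application is actually loading, one quick check (a sketch only; CloudBlobClient is used here simply as a convenient type from the library) is to print the assembly version at startup and compare it with the emulator release notes:

// Hypothetical sketch: report the loaded Storage Client Library assembly version,
// which determines the x-ms-version value the library sends.
Version clientVersion = typeof(CloudBlobClient).Assembly.GetName().Version;
Console.WriteLine("Storage Client Library assembly version: {0}", clientVersion);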

Running the storage emulator requires administrative privileges

You are prompted for administrator credentials when you run the storage emulator. This only occurs when you are initializing the storage emulator for the first time. After you have initialized the storage emulator, you do not need administrative privileges to run it again.

For more information, see Use the Azure Storage Emulator for Development and Testing. You can also initialize the storage emulator in Visual Studio, which also requires administrative privileges.

You are encountering problems installing the Azure SDK for .NET

When you try to install the SDK, it fails while trying to install the storage emulator on your local machine. The installation log contains one of the following messages:

  • CAQuietExec: Error: Unable to access SQL instance
  • CAQuietExec: Error: Unable to create database

The cause is an issue with the existing LocalDB installation. By default, the storage emulator uses LocalDB to persist data when it simulates the Azure storage services. You can reset your LocalDB instance by running the following commands in a command-prompt window before trying to install the SDK.

sqllocaldb stop v11.0
sqllocaldb delete v11.0
delete %USERPROFILE%\WAStorageEmulatorDb3*.*
sqllocaldb create v11.0

The delete command removes any old database files from previous installations of the storage emulator.

You have a different issue with a storage service

If the previous troubleshooting sections do not include the issue you are having with a storage service, you should adopt the following approach to diagnosing and troubleshooting your issue.

  • Check your metrics to see if there is any change from your expected baseline behavior. From the metrics, you may be able to determine whether the issue is transient or permanent, and which storage operations the issue is affecting.
  • You can use the metrics information to help you search your server-side log data for more detailed information about any errors that are occurring. This information may help you troubleshoot and resolve the issue.
  • If the information in the server-side logs is not sufficient to troubleshoot the issue successfully, you can use the Storage Client Library client-side logs to investigate the behavior of your client application, and tools such as Fiddler, Wireshark, and Microsoft Message Analyzer to investigate your network.

For more information about using Fiddler, see "Appendix 1: Using Fiddler to capture HTTP and HTTPS traffic."

For more information about using Wireshark, see "Appendix 2: Using Wireshark to capture network traffic."

For more information about using Microsoft Message Analyzer, see "Appendix 3: Using Microsoft Message Analyzer to capture network traffic."

Appendices

The appendices describe several tools that you may find useful when you are diagnosing and troubleshooting issues with Azure Storage (and other services). These tools are not part of Azure Storage and some are third-party products. As such, the tools discussed in these appendices are not covered by any support agreement you may have with Azure or Azure Storage, and therefore as part of your evaluation process you should examine the licensing and support options available from the providers of these tools.

Appendix 1: Using Fiddler to capture HTTP and HTTPS traffic

Fiddler is a useful tool for analyzing the HTTP and HTTPS traffic between your client application and the Azure storage service you are using.

Note

Fiddler can decode HTTPS traffic; you should read the Fiddler documentation carefully to understand how it does this, and to understand the security implications.

This appendix provides a brief walkthrough of how to configure Fiddler to capture traffic between the local machine where you have installed Fiddler and the Azure storage services.

After you have launched Fiddler, it will begin capturing HTTP and HTTPS traffic on your local machine. The following are some useful commands for controlling Fiddler:

  • Stop and start capturing traffic. On the main menu, go to File and then click Capture Traffic to toggle capturing on and off.
  • Save captured traffic data. On the main menu, go to File, click Save, and then click All Sessions: this enables you to save the traffic in a Session Archive file. You can reload a Session Archive later for analysis, or send it to Azure support if requested.

To limit the amount of traffic that Fiddler captures, you can use filters that you configure in the Filters tab. The following screenshot shows a filter that captures only traffic sent to the contosoemaildist.table.core.chinacloudapi.cn storage endpoint:

Appendix 2: Using Wireshark to capture network traffic

Wireshark is a network protocol analyzer that enables you to view detailed packet information for a wide range of network protocols.

The following procedure shows you how to capture detailed packet information for traffic from the local machine where you installed Wireshark to the table service in your Azure storage account.

  1. Launch Wireshark on your local machine.

  2. In the Start section, select the local network interface or interfaces that are connected to the internet.

  3. Click Capture Options.

  4. Add a filter to the Capture Filter textbox. For example, host contosoemaildist.table.core.chinacloudapi.cn will configure Wireshark to capture only packets sent to or from the table service endpoint in the contosoemaildist storage account. Check out the complete list of Capture Filters.

  5. Click Start. Wireshark will now capture all the packets sent to or from the table service endpoint as you use your client application on your local machine.

  6. When you have finished, on the main menu click Capture and then Stop.

  7. To save the captured data in a Wireshark Capture File, on the main menu click File and then Save.

Wireshark will highlight any errors that exist in the packet list window. You can also use the Expert Info window (click Analyze, then Expert Info) to view a summary of errors and warnings.

You can also choose to view the TCP data as the application layer sees it by right-clicking the TCP data and selecting Follow TCP Stream. This is useful if you captured your dump without a capture filter. For more information, see Following TCP Streams.

Note

For more information about using Wireshark, see the Wireshark Users Guide.

Appendix 3: Using Microsoft Message Analyzer to capture network traffic

You can use Microsoft Message Analyzer to capture HTTP and HTTPS traffic in a similar way to Fiddler, and capture network traffic in a similar way to Wireshark.

Configure a web tracing session using Microsoft Message Analyzer

To configure a web tracing session for HTTP and HTTPS traffic using Microsoft Message Analyzer, run the Microsoft Message Analyzer application and then, on the File menu, click Capture/Trace. In the list of available trace scenarios, select Web Proxy. Then, in the Trace Scenario Configuration panel, in the HostnameFilter textbox, add the names of your storage endpoints (you can look up these names in the Azure portal). For example, if the name of your Azure storage account is contosodata, you should add the following to the HostnameFilter textbox:

contosodata.blob.core.chinacloudapi.cn contosodata.table.core.chinacloudapi.cn contosodata.queue.core.chinacloudapi.cn

Note

A space character separates the hostnames.

When you are ready to start collecting trace data, click the Start With button.

For more information about the Microsoft Message Analyzer Web Proxy trace, see Microsoft-PEF-WebProxy Provider.

The built-in Web Proxy trace in Microsoft Message Analyzer is based on Fiddler; it can capture client-side HTTPS traffic and display unencrypted HTTPS messages. The Web Proxy trace works by configuring a local proxy for all HTTP and HTTPS traffic, which gives it access to the unencrypted messages.

Diagnosing network issues using Microsoft Message Analyzer

In addition to using the Microsoft Message Analyzer Web Proxy trace to capture details of the HTTP/HTTPS traffic between the client application and the storage service, you can also use the built-in Local Link Layer trace to capture network packet information. This enables you to capture data similar to what you can capture with Wireshark, and to diagnose network issues such as dropped packets.

The following screenshot shows an example Local Link Layer trace with some informational messages in the DiagnosisTypes column. Clicking an icon in the DiagnosisTypes column shows the details of the message. In this example, the server retransmitted message #305 because it did not receive an acknowledgement from the client:

When you create the trace session in Microsoft Message Analyzer, you can specify filters to reduce the amount of noise in the trace. On the Capture / Trace page where you define the trace, click the Configure link next to Microsoft-Windows-NDIS-PacketCapture. The following screenshot shows a configuration that filters TCP traffic for the IP addresses of three storage services:

For more information about the Microsoft Message Analyzer Local Link Layer trace, see Microsoft-PEF-NDIS-PacketCapture Provider.

Appendix 4: Using Excel to view metrics and log data

Many tools enable you to download Storage Metrics data from Azure table storage in a delimited format that makes it easy to load the data into Excel for viewing and analysis. Storage Logging data from Azure blob storage is already in a delimited format that you can load into Excel. However, you will need to add appropriate column headings based on the information at Storage Analytics Log Format and Storage Analytics Metrics Table Schema.

To import your Storage Logging data into Excel after you download it from blob storage:

  • On the Data menu, click From Text.
  • Browse to the log file you want to view and click Import.
  • On step 1 of the Text Import Wizard, select Delimited.

On step 2 of the Text Import Wizard, select Semicolon as the only delimiter and choose double-quote as the Text qualifier. Then click Finish and choose where to place the data in your workbook.

Next steps

For more information about analytics in Azure Storage, see these resources: