Monitor performance in web applications

Make sure your application is performing well, and find out quickly about any failures. Application Insights tells you about performance issues and exceptions, and helps you find and diagnose their root causes.

Application Insights can monitor Java and ASP.NET web applications and services, as well as WCF services. They can be hosted on-premises, on virtual machines, or as 21Vianet Azure websites.

On the client side, Application Insights can take telemetry from web pages and a wide variety of devices, including iOS, Android, and Windows Store apps.

Set up performance monitoring

If you haven't yet added Application Insights to your project (that is, if it doesn't have ApplicationInsights.config), choose one of these ways to get started:

Exploring performance metrics

In the Azure portal, browse to the Application Insights resource that you set up for your application. The overview blade shows basic performance data:

Click any chart to see more detail and to see results for a longer period. For example, click the Requests tile and then select a time range:

Figure: Click through to more data and select a time range

Click a chart to choose which metrics it displays, or add a new chart and select its metrics:

Figure: Click a chart to select metrics

Note

Uncheck all the metrics to see the full selection that is available. The metrics fall into groups; when any member of a group is selected, only the other members of that group appear.

What does it all mean? Performance tiles and reports

There are various performance metrics you can get. Let's start with those that appear by default on the application blade.

Requests

The number of HTTP requests received in a specified period. Compare this with the results on other reports to see how your app behaves as the load varies.

HTTP requests include all GET or POST requests for pages, data, and images.

Click the tile to get counts for specific URLs.

Average response time

Measures the time between a web request entering your application and the response being returned.

The points show a moving average. If there are many requests, some might deviate from the average without producing an obvious peak or dip in the graph.
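The effect of averaging can be illustrated with a minimal Python sketch. It does not show how Application Insights computes its charts; the sample data, the window size, and the `moving_average` helper are all assumptions made for illustration.

```python
from statistics import mean

def moving_average(values, window):
    """Return the trailing moving average of values over the given window size."""
    averages = []
    for i in range(len(values)):
        start = max(0, i - window + 1)
        averages.append(mean(values[start:i + 1]))
    return averages

# Hypothetical response times in milliseconds: 99 ordinary requests and one outlier.
response_times_ms = [100] * 50 + [2000] + [100] * 49
smoothed = moving_average(response_times_ms, window=100)

print("slowest single request (ms):", max(response_times_ms))
print("largest smoothed point (ms):", round(max(smoothed), 1))
```

In this made-up data set one request takes 2000 ms, yet no smoothed point rises above 150 ms, so the slow request would be easy to miss on a chart of averages.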

Look for unusual peaks. In general, expect response time to rise as the number of requests rises. If the rise is disproportionate, your app might be hitting a resource limit such as CPU or the capacity of a service it uses.

Click the tile to get times for specific URLs.

Slowest requests

Shows which requests might need performance tuning.

Failed requests

A count of requests that threw uncaught exceptions.

Click the tile to see the details of specific failures, and select an individual request to see its detail.

Only a representative sample of failures is retained for individual inspection.

Other metrics

To see what other metrics you can display, click a graph, and then deselect all the metrics to see the full available set. Click (i) to see each metric's definition.

Figure: Deselect all metrics to see the full set

Selecting any metric disables the others that can't appear on the same chart.

Set alerts

To be notified by email of unusual values of any metric, add an alert. You can choose to send the email either to the account administrators or to specific email addresses.

Set the resource before the other properties. Don't choose the webtest resources if you want to set alerts on performance or usage metrics.

Be careful to note the units in which you're asked to enter the threshold value.
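Why the unit matters can be seen in a small Python sketch. It is not the portal's alert logic; the `exceeds_threshold` function and the unit names are hypothetical and exist only to show how the same number behaves differently under different units.

```python
def exceeds_threshold(value_ms, threshold, unit):
    """Compare a metric in milliseconds against a threshold given in the stated unit."""
    factors_to_ms = {"ms": 1, "s": 1000}  # hypothetical unit names
    return value_ms > threshold * factors_to_ms[unit]

avg_response_ms = 1500  # hypothetical average response time

# A threshold of 2 means very different things in seconds and in milliseconds.
print(exceeds_threshold(avg_response_ms, 2, "s"))   # False: 1500 ms is below 2 s
print(exceeds_threshold(avg_response_ms, 2, "ms"))  # True: 1500 ms is above 2 ms
```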

I don't see the Add Alert button. Is this a group account to which you have read-only access? Check with the account administrator.

Diagnosing issues

Here are a few tips for finding and diagnosing performance issues:

  • Set up web tests to be alerted if your website goes down or responds incorrectly or slowly.
  • Compare the Request count with other metrics to see if failures or slow responses are related to load.
  • Insert and search trace statements in your code to help pinpoint problems.
  • Monitor your web app in operation with Live Metrics Stream.
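As a rough illustration of the trace-statement tip, the following Python sketch uses the standard library `logging` module as a stand-in. It does not use the Application Insights SDK, and the logger name, the handler function, and the message format are assumptions made for this example.

```python
import logging
import time

logging.basicConfig(level=logging.INFO, format="%(asctime)s %(levelname)s %(message)s")
logger = logging.getLogger("customers")  # hypothetical logger name

def get_customer_details(customer_id):
    """Hypothetical request handler that traces how long its work takes."""
    started = time.perf_counter()
    logger.info("GET Customers/Details started id=%s", customer_id)
    details = {"id": customer_id, "name": "example"}  # placeholder for real work
    elapsed_ms = (time.perf_counter() - started) * 1000
    logger.info("GET Customers/Details finished id=%s elapsed_ms=%.1f", customer_id, elapsed_ms)
    return details

result = get_customer_details(42)
```

Including the operation name and the elapsed time in each message makes the traces easier to search for when a particular operation is slow.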

Find and fix performance bottlenecks with the performance investigation experience

You can use the performance investigation experience to review slow-performing operations in your web app. The duration distribution shown for the selected operation lets you assess at a glance how bad the experience is for your customers. You can see how many user interactions were affected by each slow operation. In the following example, we take a closer look at the experience for the GET Customers/Details operation. The duration distribution shows three spikes. The leftmost spike is around 400 ms and represents a great, responsive experience. The middle spike is around 1.2 s and represents a mediocre experience. Finally, at 3.6 s there is another small spike that represents the 99th percentile experience, which is likely to cause customers to leave dissatisfied. That experience is about nine times slower than the great experience for the same operation.

Figure: Three duration spikes for GET Customers/Details

To get a better sense of the user experiences for this operation, we can select a larger time range. We can then narrow down to a specific time window in which the operation was slow. In the following example, we've switched from the default 24-hour time range to the 7-day time range and then zoomed in to the 9:47 to 12:47 time window between Tue the 12th and Wed the 13th. Both the duration distribution and the number of sample and profiler traces have been updated on the right.

Figure: Three duration spikes for GET Customers/Details in the 7-day range, with the selected time window

To narrow in on the slow experiences, we next zoom in to the durations that fall between the 95th and the 99th percentile. These represent the 4% of user interactions that were slow.
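The relationship between the two percentiles and the 4% figure can be checked with a short Python sketch. The duration data is made up, and `statistics.quantiles` stands in for whatever percentile method the portal uses, which the text does not specify.

```python
from statistics import quantiles

# Hypothetical request durations in milliseconds: 1, 2, ..., 1000.
durations_ms = list(range(1, 1001))

# quantiles(..., n=100) returns 99 cut points; index 94 is the 95th percentile
# and index 98 is the 99th percentile.
cut_points = quantiles(durations_ms, n=100)
p95, p99 = cut_points[94], cut_points[98]

# Keep only the durations that fall between the two percentiles.
slow_band = [d for d in durations_ms if p95 < d <= p99]
share = len(slow_band) / len(durations_ms)

print(f"p95={p95} ms, p99={p99} ms, share of requests in band={share:.0%}")
```

Zooming in to this band leaves out the slowest 1% of requests, which are often dominated by rare outliers, and keeps the slow experiences that affect a meaningful share of users.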


We can now look either at the representative samples, by clicking the Samples button, or at the representative profiler traces, by clicking the Profiler traces button. In this example, four traces have been collected for GET Customers/Details in the time window and duration range of interest.

Sometimes the issue is not in your code, but in a dependency that your code calls. You can switch to the Dependencies tab in the performance triage view to investigate such slow dependencies. By default the performance view shows trending averages, but what you really want to look at is the 95th percentile (or the 99th, if you are monitoring a mature service). In the following example, we have focused on the slow Azure BLOB dependency, where we call PUT fabrikamaccount. The good experiences cluster around 40 ms, while the slow calls to the same dependency take about three times as long, clustering around 120 ms. It doesn't take many of these calls to add up and noticeably slow down the operation that makes them. You can drill into the representative samples and profiler traces, just as you can on the Operations tab.
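A small Python sketch shows why the average can hide slow dependency calls and how a few slow calls add up. The 40 ms and 120 ms clusters come from the example above; the split of 90 fast calls to 10 slow calls and the count of 10 calls per operation are assumptions made for illustration.

```python
from statistics import mean, quantiles

# Hypothetical dependency call durations in milliseconds:
# 90 fast calls around 40 ms and 10 slow calls around 120 ms.
durations_ms = [40] * 90 + [120] * 10

average_ms = mean(durations_ms)
p95_ms = quantiles(durations_ms, n=100)[94]  # 95th percentile cut point

print(f"average: {average_ms} ms, 95th percentile: {p95_ms} ms")

# Hypothetical operation that makes 10 sequential calls to this dependency.
calls_per_operation = 10
fast_total_ms = calls_per_operation * 40
slow_total_ms = calls_per_operation * 120
print(f"all fast: {fast_total_ms} ms, all slow: {slow_total_ms} ms")
```

With this data the average stays close to the fast cluster while the 95th percentile lands on the slow one, which is why the percentile view is the more useful one for spotting slow dependencies.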


The performance investigation experience shows relevant insights alongside the sample set you decided to focus on. The best way to look at all of the available insights is to switch to a 30-day time range and then select Overall to see insights across all operations for the past month.


Next steps

Web tests - Have web requests sent to your application at regular intervals from around the world.

Capture and search diagnostic traces - Insert trace calls and sift through the results to pinpoint issues.

Usage tracking - Find out how people use your application.

Troubleshooting and Q&A