Profile production applications in Azure with Application Insights

Enable Application Insights Profiler for your application

Azure Application Insights Profiler provides performance traces for applications that are running in production in Azure. Profiler captures the data automatically at scale without negatively affecting your users. Profiler helps you identify the “hot” code path, the one that takes the longest time while the application handles a particular web request.

Profiler works with .NET applications that are deployed on the following Azure services. Specific instructions for enabling Profiler for each service type are in the links below.

If you've enabled Profiler but aren't seeing traces, check the Troubleshooting guide.

View Profiler data

For Profiler to upload traces, your application must be actively handling requests. If you're doing an experiment, you can generate requests to your web app by using Application Insights performance testing. If you've newly enabled Profiler, you can run a short load test. While the load test is running, select the Profile Now button on the Profiler Settings pane. While Profiler is enabled, it profiles at a random time about once per hour, for two minutes each time. If your application is handling a steady stream of requests, Profiler uploads traces every hour.
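If you don't have a load-testing tool at hand, a short script can produce the steady stream of requests that Profiler needs during an experiment. The following is a minimal sketch, not part of Application Insights: the function name `generate_requests`, the injectable `fetch` parameter, and the example URL are all hypothetical, and the default fetcher uses only the Python standard library.

```python
import time
import urllib.request
from typing import Callable, List, Optional


def generate_requests(
    url: str,
    count: int,
    delay_seconds: float = 0.0,
    fetch: Optional[Callable[[str], int]] = None,
) -> List[int]:
    """Send `count` sequential GET requests and return the HTTP status codes.

    `fetch` is a hypothetical hook so the loop can be exercised without a
    network; by default it issues a real request with urllib.
    """

    def default_fetch(target: str) -> int:
        # urlopen raises urllib.error.HTTPError for 4xx/5xx responses.
        with urllib.request.urlopen(target, timeout=10) as response:
            return response.status

    do_fetch = fetch or default_fetch
    statuses = []
    for _ in range(count):
        statuses.append(do_fetch(url))
        if delay_seconds:
            time.sleep(delay_seconds)
    return statuses


# Example call (the URL is a placeholder for your own web app):
# generate_requests("https://your-app.example/", count=200, delay_seconds=0.1)
```

Sequential requests keep the sketch simple. A real load test would normally issue requests concurrently.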

After your application receives some traffic and Profiler has had time to upload the traces, you should have traces to view. This process can take 5 to 10 minutes. To view traces, in the Performance pane, select Take Actions, and then select the Profiler Traces button.

Application Insights Performance pane, previewing Profiler traces

Select a sample to display a code-level breakdown of time spent executing the request.

Application Insights trace explorer

The trace explorer displays the following information:

  • Show Hot Path: Opens the biggest leaf node, or at least something close. In most cases, this node is near a performance bottleneck.
  • Label: The name of the function or event. The tree displays a mix of code and events that occurred, such as SQL and HTTP events. The top event represents the overall request duration.
  • Elapsed: The time interval between the start of the operation and the end of the operation.
  • When: The time when the function or event was running in relation to other functions.
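To make the idea of a hot path concrete, the sketch below models a trace as a tree of labeled nodes with elapsed times and walks from the root to a leaf, always choosing the child with the largest elapsed time. This greedy walk is an assumption made for illustration, not Profiler's documented algorithm. The `TraceNode` type, the `hot_path` function, and the sample labels and timings are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TraceNode:
    """Hypothetical stand-in for one row of the trace explorer tree."""
    label: str
    elapsed_ms: float
    children: List["TraceNode"] = field(default_factory=list)


def hot_path(root: TraceNode) -> List[str]:
    """Follow the child with the largest elapsed time until a leaf is reached."""
    path = [root.label]
    node = root
    while node.children:
        node = max(node.children, key=lambda child: child.elapsed_ms)
        path.append(node.label)
    return path


# Made-up request: most of the time is spent waiting on a database call.
request = TraceNode("GET /orders", 900, [
    TraceNode("OrdersController.Index", 850, [
        TraceNode("SqlCommand.Execute", 700),
        TraceNode("AWAIT_TIME", 100),
    ]),
    TraceNode("clr!ThePreStub", 40),
])

print(hot_path(request))
# ['GET /orders', 'OrdersController.Index', 'SqlCommand.Execute']
```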

How to read performance data

The Microsoft service profiler uses a combination of sampling methods and instrumentation to analyze the performance of your application. When detailed collection is in progress, the service profiler samples the instruction pointer of each machine CPU every millisecond. Each sample captures the complete call stack of the thread that's currently executing. It gives detailed information about what that thread was doing, at both a high level and a low level of abstraction. The service profiler also collects other events to track activity correlation and causality, including context switching events, Task Parallel Library (TPL) events, and thread pool events.

The call stack that's displayed in the timeline view is the result of the sampling and instrumentation. Because each sample captures the complete call stack of the thread, it includes code from Microsoft .NET Framework and from other frameworks that you reference.
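The way call-stack samples turn into per-function times can be sketched as follows. Each sample is a list of frames, outermost first. With one sample per millisecond, the number of samples in which a function appears anywhere on the stack approximates its inclusive time, and the number in which it is the innermost frame approximates its exclusive time. The `summarize` function, the exclusive-time bookkeeping, and the sample data are illustrative assumptions and do not reflect the profiler's internal format.

```python
from collections import Counter
from typing import Iterable, List, Tuple

SAMPLE_INTERVAL_MS = 1  # the text above states one sample per millisecond


def summarize(samples: Iterable[List[str]]) -> Tuple[Counter, Counter]:
    """Return (inclusive_ms, exclusive_ms) per frame from stack samples."""
    inclusive: Counter = Counter()
    exclusive: Counter = Counter()
    for stack in samples:
        if not stack:
            continue
        # Count each frame once per sample, even if it recurses.
        for frame in set(stack):
            inclusive[frame] += SAMPLE_INTERVAL_MS
        # The innermost frame is the code that was executing.
        exclusive[stack[-1]] += SAMPLE_INTERVAL_MS
    return inclusive, exclusive


# Four made-up samples, outermost frame first.
samples = [
    ["Main", "Handle", "ParseJson"],
    ["Main", "Handle", "ParseJson"],
    ["Main", "Handle", "clr!JIT_New"],
    ["Main", "Handle"],
]
inclusive, exclusive = summarize(samples)
print(inclusive["Handle"], exclusive["Handle"])  # 4 1
```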

Object allocation (clr!JIT_New or clr!JIT_Newarr1)

clr!JIT_New and clr!JIT_Newarr1 are helper functions in .NET Framework that allocate memory from a managed heap. clr!JIT_New is invoked when an object is allocated. clr!JIT_Newarr1 is invoked when an object array is allocated. These two functions are usually fast and take relatively small amounts of time. If clr!JIT_New or clr!JIT_Newarr1 takes a lot of time in your timeline, the code might be allocating many objects and consuming significant amounts of memory.

Loading code (clr!ThePreStub)

clr!ThePreStub is a helper function in .NET Framework that prepares the code to execute for the first time. This preparation usually includes, but isn't limited to, just-in-time (JIT) compilation. For each C# method, clr!ThePreStub should be invoked at most once during the lifetime of a process.

If clr!ThePreStub takes a long time for a request, the request is the first one to execute that method. The time that the .NET Framework runtime needs to load a method for the first time can be significant. You might consider using a warmup process that executes that portion of the code before your users access it, or consider running Native Image Generator (ngen.exe) on your assemblies.

Lock contention (clr!JITutil_MonContention or clr!JITutil_MonEnterWorker)

clr!JITutil_MonContention or clr!JITutil_MonEnterWorker indicates that the current thread is waiting for a lock to be released. This text is often displayed when you execute a C# lock statement, invoke the Monitor.Enter method, or invoke a method with the MethodImplOptions.Synchronized attribute. Lock contention usually occurs when thread A acquires a lock and thread B tries to acquire the same lock before thread A releases it.
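The thread A and thread B scenario can be reproduced in a few lines. The sketch below is a Python analogy built on `threading.Lock`, not .NET code, and it won't produce the clr! frames named above. It only shows the ordering that causes contention: thread A holds the lock, so thread B's attempt fails until A releases it. The function name `demonstrate_contention` is hypothetical.

```python
import threading
from typing import Tuple


def demonstrate_contention() -> Tuple[bool, bool]:
    """Return (contended_while_held, acquired_after_release)."""
    lock = threading.Lock()
    a_has_lock = threading.Event()
    a_may_release = threading.Event()

    def thread_a() -> None:
        with lock:                 # thread A acquires the lock
            a_has_lock.set()
            a_may_release.wait()   # ...and holds it until told to release

    a = threading.Thread(target=thread_a)
    a.start()
    a_has_lock.wait()

    # "Thread B" (this thread) tries to take the same lock while A holds it.
    # A blocking acquire would wait here; the non-blocking form just reports it.
    contended = not lock.acquire(blocking=False)

    a_may_release.set()
    a.join()

    # Once A has released the lock, B acquires it immediately.
    acquired = lock.acquire(blocking=False)
    if acquired:
        lock.release()
    return contended, acquired


print(demonstrate_contention())  # (True, True)
```

In a real service, the usual remedies are to shorten the time spent inside the lock or to reduce how much state is shared between threads.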

Loading code ([COLD])

If the method name contains [COLD], such as mscorlib.ni![COLD]System.Reflection.CustomAttribute.IsDefined, the .NET Framework runtime is executing, for the first time, code that isn't optimized by profile-guided optimization. For each method, it should be displayed at most once during the lifetime of the process.

If loading code takes a substantial amount of time for a request, the request is the first one to execute the unoptimized portion of the method. Consider using a warmup process that executes that portion of the code before your users access it.

Send HTTP request

Methods such as HttpClient.Send indicate that the code is waiting for an HTTP request to be completed.

Database operation

Methods such as SqlCommand.Execute indicate that the code is waiting for a database operation to finish.

Waiting (AWAIT_TIME)

AWAIT_TIME indicates that the code is waiting for another task to finish. This delay usually happens with the C# await operator. When the code does a C# await, the thread unwinds and returns control to the thread pool, and no thread is blocked waiting for the await to finish. However, logically, the thread that did the await is "blocked," and it's waiting for the operation to finish. AWAIT_TIME indicates the blocked time spent waiting for the task to finish.
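The difference between a logical wait and a blocked thread can be seen in a small Python asyncio analogy (not C#, and not something Profiler would trace). While `handler` is suspended at its `await`, the same thread runs `other_work`. The interval between "handler awaiting" and "handler resumed" is the kind of time that AWAIT_TIME represents. All function names here are hypothetical.

```python
import asyncio
from typing import List


async def slow_operation(log: List[str]) -> None:
    log.append("operation started")
    await asyncio.sleep(0.01)   # stands in for I/O, such as a database call
    log.append("operation finished")


async def handler(log: List[str]) -> None:
    log.append("handler awaiting")
    await slow_operation(log)   # logically "blocked" here; the thread is not
    log.append("handler resumed")


async def other_work(log: List[str]) -> None:
    # Runs on the same thread while handler is suspended at its await.
    log.append("other work ran")


async def main() -> List[str]:
    log: List[str] = []
    await asyncio.gather(handler(log), other_work(log))
    return log


events = asyncio.run(main())
print(events)
```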

Blocked time

BLOCKED_TIME indicates that the code is waiting for another resource to be available. For example, it might be waiting for a synchronization object, for a thread to be available, or for a request to finish.

Unmanaged Async

.NET Framework emits ETW events and passes activity IDs between threads so that async calls can be tracked across threads. Unmanaged code (native code) and some older styles of asynchronous code are missing these events and activity IDs, so the profiler can't tell which thread the work runs on or which functions are running on that thread. This is labeled 'Unmanaged Async' in the call stack. If you download the ETW file, you may be able to use PerfView to get more insight into what is happening.

CPU time

The CPU is busy executing the instructions.

Disk time

The application is performing disk operations.

Network time

The application is performing network operations.

When column

The When column is a visualization of how the inclusive samples collected for a node vary over time. The total range of the request is divided into 32 time buckets. The inclusive samples for that node are accumulated in those 32 buckets. Each bucket is represented as a bar. The height of the bar represents a scaled value. For nodes that are marked CPU_TIME or BLOCKED_TIME, or where there is an obvious relationship to consuming a resource (for example, a CPU, disk, or thread), the bar represents the consumption of one of the resources during the bucket. For these metrics, it's possible to get a value of greater than 100 percent by consuming multiple resources. For example, if you use, on average, two CPUs during an interval, you get 200 percent.
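The bucketing can be worked through with a small calculation. The sketch below assumes that each sample stands for one millisecond of one resource (for example, one CPU) and that a bar's value is the sampled time in a bucket divided by the bucket's width. The exact scaling that Profiler applies isn't specified above, so treat the `when_column` function and its parameters as illustrative assumptions.

```python
from typing import List, Sequence


def when_column(
    sample_times_ms: Sequence[float],
    request_duration_ms: float,
    buckets: int = 32,
    sample_interval_ms: float = 1.0,
) -> List[float]:
    """Return per-bucket resource consumption as a percentage."""
    bucket_width = request_duration_ms / buckets
    counts = [0] * buckets
    for t in sample_times_ms:
        # Clamp so a sample at the very end lands in the last bucket.
        index = min(int(t / bucket_width), buckets - 1)
        counts[index] += 1
    return [100.0 * c * sample_interval_ms / bucket_width for c in counts]


# A 320 ms request gives 32 buckets of 10 ms each.
one_cpu = list(range(0, 10))         # one CPU busy from 0 ms to 9 ms
two_cpus = list(range(10, 20)) * 2   # two CPUs busy from 10 ms to 19 ms
bars = when_column(one_cpu + two_cpus, request_duration_ms=320)

print(bars[0], bars[1], bars[2])  # 100.0 200.0 0.0
```

The second bar reaches 200 percent because two CPUs were sampled as busy throughout that bucket, which matches the two-CPU example in the text.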

Limitations

The default data retention period is five days. The maximum data that's ingested per day is 10 GB.

There are no charges for using the Profiler service. To use it, your web app must be hosted in at least the Basic tier of the Web Apps feature of Azure App Service.

Overhead and sampling algorithm

On each virtual machine that hosts an application with Profiler enabled for capturing traces, Profiler runs for two minutes at a random time every hour. When Profiler is running, it adds from 5 to 15 percent CPU overhead to the server.
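Because Profiler runs for only two minutes out of every sixty, the overhead averaged over a full hour is much smaller than the 5 to 15 percent seen while it runs. The calculation below is a back-of-the-envelope sketch that assumes Profiler adds no overhead while idle. The function name `amortized_overhead_percent` is hypothetical.

```python
def amortized_overhead_percent(
    overhead_while_running_percent: float,
    run_minutes: float = 2.0,
    period_minutes: float = 60.0,
) -> float:
    """Average CPU overhead over the whole period, assuming none when idle."""
    return overhead_while_running_percent * run_minutes / period_minutes


low = amortized_overhead_percent(5)    # about 0.17 percent averaged over the hour
high = amortized_overhead_percent(15)  # 0.5 percent averaged over the hour
print(round(low, 2), round(high, 2))   # 0.17 0.5
```

The averaged figure doesn't change what users experience during the two-minute window itself, when the full 5 to 15 percent applies.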

Next steps

Enable Application Insights Profiler for your Azure application.