How to monitor virtual machines in Azure

With the significant growth of VMs hosted in Azure, it's important to identify performance and health issues that impact the applications and infrastructure services they support. By default, Azure provides basic monitoring through metrics collected by the host hypervisor: CPU usage, disk utilization, memory utilization, and network traffic. Additional metric and log data can be collected from the guest operating system by using extensions to configure diagnostics on your VMs.

To detect and help diagnose performance and health issues with the guest operating system, or with .NET-based or Java web application components running inside the VM, Azure Monitor delivers centralized monitoring with comprehensive features such as Azure Monitor for VMs and Application Insights.

Diagnostics and metrics

You can set up and monitor the collection of diagnostics data using metrics in the Azure portal, the Azure CLI, Azure PowerShell, and application programming interfaces (APIs). For example, you can:

  • Observe basic metrics for the VM. On the Overview screen of the Azure portal, the basic metrics shown include CPU usage, network usage, total disk bytes, and disk operations per second.

  • Enable the collection of boot diagnostics and view it using the Azure portal. When you bring your own image to Azure, or even when you boot one of the platform images, there are many reasons a VM can get into a non-bootable state. You can enable boot diagnostics when you create a VM by clicking Enabled for Boot Diagnostics under the Monitoring section of the Settings screen.

    As VMs boot, the boot diagnostic agent captures boot output and stores it in Azure storage. This data can be used to troubleshoot VM boot issues. Boot diagnostics are not automatically enabled when you create a VM from command-line tools. Before enabling boot diagnostics, you need to create a storage account for storing boot logs. If you enable boot diagnostics in the Azure portal, a storage account is automatically created for you.

    If you didn't enable boot diagnostics when the VM was created, you can always enable it later by using the Azure CLI, Azure PowerShell, or an Azure Resource Manager template.

  • Enable the collection of guest OS diagnostics data. When you create a VM, you can enable guest OS diagnostics on the Settings screen. When you enable the collection of diagnostics data, the IaaSDiagnostics extension for Linux or the IaaSDiagnostics extension for Windows is added to the VM, which lets you collect additional disk, CPU, and memory data.

    Using the collected diagnostics data, you can configure autoscaling for your VMs. You can also configure Azure Monitor Logs to store the data and set up alerts that notify you when performance falls outside the range you expect.
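
As an illustration of the Resource Manager template route for boot diagnostics mentioned above, the following Python sketch builds the diagnosticsProfile fragment that sits under a virtual machine resource's properties and prints it as JSON. The helper name boot_diagnostics_profile and the storage endpoint are hypothetical placeholders, and the fragment is only one part of a complete template, so treat it as a starting point rather than a finished deployment.

```python
import json


def boot_diagnostics_profile(storage_uri: str) -> dict:
    """Build the diagnosticsProfile fragment for a VM resource's properties.

    storage_uri is the blob endpoint of the storage account that will hold
    the boot logs. The account must already exist when deploying this way.
    """
    return {
        "diagnosticsProfile": {
            "bootDiagnostics": {
                "enabled": True,
                "storageUri": storage_uri,
            }
        }
    }


# Hypothetical storage account endpoint, used only for illustration.
EXAMPLE_STORAGE_URI = "https://examplebootlogs.blob.core.windows.net/"

if __name__ == "__main__":
    # Print the fragment so it can be merged into a template by hand.
    print(json.dumps(boot_diagnostics_profile(EXAMPLE_STORAGE_URI), indent=2))
```

The fragment would be merged into the properties of the VM resource in an existing template; the rest of the template (hardware profile, network profile, and so on) is out of scope here.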

Alerts

You can create alerts based on specific performance metrics. For example, you can be alerted when average CPU usage exceeds a certain threshold, or when available free disk space drops below a certain amount. Alerts can be configured in the Azure portal, with Azure Resource Manager templates, or with the Azure CLI.
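
The two example conditions above can be sketched in plain Python to show the logic a metric alert rule expresses. This is a conceptual illustration only, not how Azure Monitor evaluates alerts; the function names, sample values, and thresholds are all assumptions made for the example.

```python
from statistics import mean
from typing import Sequence


def average_exceeds(samples: Sequence[float], threshold: float) -> bool:
    """Return True when the average of the samples is above the threshold.

    An empty window never fires, because there is nothing to average.
    """
    if not samples:
        return False
    return mean(samples) > threshold


def free_space_below(free_gib: float, minimum_gib: float) -> bool:
    """Return True when free disk space has dropped below the minimum."""
    return free_gib < minimum_gib


if __name__ == "__main__":
    # Hypothetical CPU percentages collected over one evaluation window.
    cpu_window = [72.0, 88.5, 93.0, 91.5]
    if average_exceeds(cpu_window, threshold=80.0):
        print("ALERT: average CPU usage is above 80%")
    # Hypothetical free-space reading in GiB.
    if free_space_below(free_gib=4.2, minimum_gib=10.0):
        print("ALERT: free disk space is below 10 GiB")
```

In a real alert rule, the evaluation window, aggregation type, and threshold are part of the rule's configuration rather than application code.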

Azure Service Health

Azure Service Health provides personalized guidance and support when issues in Azure services affect you, and helps you prepare for upcoming planned maintenance. Azure Service Health alerts you and your teams using targeted and flexible notifications.

Azure Resource Health

Azure Resource Health helps you diagnose and get support when an Azure issue impacts your resources. It informs you about the current and past health of your resources and helps you mitigate issues. Resource Health provides technical support when you need help with Azure service issues.

Azure Activity Log

The Azure Activity Log is a subscription log that provides insight into subscription-level events that have occurred in Azure. The log includes a range of data, from Azure Resource Manager operational data to updates on Service Health events. You can click Activity Log in the Azure portal to view the log for your VM.

Some of the things you can do with the activity log include:

You can also access activity log data by using Azure PowerShell, the Azure CLI, or the Monitor REST APIs.

Azure Resource Logs are logs emitted by your VM that provide rich, frequent data about its operation. Resource logs differ from the activity log by providing insight into operations that were performed within the VM.

Some of the things you can do with diagnostics logs include:

  • Save them to a storage account for auditing or manual inspection. You can specify the retention time (in days) using Resource Diagnostic Settings.

  • Analyze them with Log Analytics.

Advanced monitoring

Enable Application Insights to gain visibility into the application or service supported by the Azure VM or virtual machine scale set. It helps you identify issues with the guest OS or the workload running in the VM, and understand whether such an issue is impacting the availability or performance of the application or whether the problem lies in the application itself.

Azure Monitor for VMs monitors your Azure virtual machines (VMs) at scale by analyzing the performance and health of your Windows and Linux VMs, including their different processes and interconnected dependencies on other resources and the external processes it discovers. It includes several trend performance charts to help you investigate problems and assess the capacity of your VMs. The dependency map shows monitored and unmonitored machines, along with failed and active network connections between processes and these machines, and shows trend charts with standard network connection metrics. Combined with Application Insights, you can monitor your application and capture telemetry such as HTTP requests and exceptions, so you can correlate issues between the VMs and your application. Configure Azure Monitor alerts to notify you of important conditions detected in the monitoring data collected by Azure Monitor for VMs.

Next steps