Continuous monitoring with Azure Monitor

Continuous monitoring refers to the processes and technology required to incorporate monitoring across each phase of your DevOps and IT operations lifecycles. It helps to continuously ensure the health, performance, and reliability of your application and infrastructure as they move from development to production. Continuous monitoring builds on the concepts of Continuous Integration and Continuous Deployment (CI/CD), which help you develop and deliver software faster and more reliably to provide continuous value to your users.

Azure Monitor is the unified monitoring solution in Azure that provides full-stack observability across applications and infrastructure in the cloud and on-premises. It also integrates with the ITSM and SIEM tools of your choice to help track issues and incidents within your existing IT processes.

This article describes specific steps for using Azure Monitor to enable continuous monitoring throughout your workflows. It includes links to other documentation that provides details on implementing different features.

Enable monitoring for all your applications

To gain observability across your entire environment, you need to enable monitoring on all your web applications and services. This allows you to easily visualize end-to-end transactions and connections across all the components.

Enable monitoring for your entire infrastructure

Applications are only as reliable as their underlying infrastructure. Having monitoring enabled across your entire infrastructure helps you achieve full observability and makes it easier to discover a potential root cause when something fails. Azure Monitor helps you track the health and performance of your entire hybrid infrastructure, including resources such as VMs, containers, storage, and network.

Combine resources in Azure Resource Groups

A typical application on Azure today includes multiple resources such as VMs and App Services, or microservices hosted on Cloud Services, AKS clusters, or Service Fabric. These applications frequently rely on dependencies like Event Hubs, Storage, SQL, and Service Bus.

  • Combine resources in Azure Resource Groups to get full visibility across all the resources that make up your different applications.
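
As an illustration of what grouping by resource group looks like in practice, the following minimal sketch buckets Azure resource IDs by the resource group named in each ID. It relies only on the standard resource ID layout (`/subscriptions/<id>/resourceGroups/<name>/providers/...`). The function names and the sample IDs in the usage note are hypothetical, and the sketch does not call any Azure API.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def resource_group_of(resource_id: str) -> str:
    """Extract the resource group name from an Azure resource ID.

    Resource group names are case-insensitive, so the name is lowercased.
    """
    parts = resource_id.strip("/").split("/")
    # Walk consecutive (key, value) segments and find the resourceGroups key.
    for key, value in zip(parts, parts[1:]):
        if key.lower() == "resourcegroups":
            return value.lower()
    raise ValueError(f"no resource group segment in: {resource_id!r}")


def group_by_resource_group(resource_ids: Iterable[str]) -> Dict[str, List[str]]:
    """Bucket resource IDs by the resource group they belong to."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for resource_id in resource_ids:
        groups[resource_group_of(resource_id)].append(resource_id)
    return dict(groups)
```

For example, passing the IDs of a VM and a storage account that both live in a hypothetical `shop-prod` resource group returns a single `shop-prod` entry containing both IDs, which is the per-application view the bullet above describes.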

Ensure quality through Continuous Deployment

Continuous Integration / Continuous Deployment allows you to automatically integrate and deploy code changes to your application based on the results of automated testing. It streamlines the deployment process and ensures the quality of any changes before they move into production.
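
The idea of gating a deployment on automated test results can be sketched as a small decision function. This is a minimal illustration, not a feature of any particular CI/CD product: the `StageResult` type, the `may_promote` name, and the 1% error-rate budget are all assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class StageResult:
    """Outcome of one pre-production stage (hypothetical shape)."""
    tests_passed: int
    tests_failed: int
    error_rate: float  # fraction of failed requests observed in this stage


def may_promote(result: StageResult, max_error_rate: float = 0.01) -> bool:
    """Allow promotion only if tests ran, none failed, and errors are within budget."""
    if result.tests_failed > 0:
        return False
    if result.tests_passed == 0:
        # No evidence of quality at all; treat as a failed gate.
        return False
    return result.error_rate <= max_error_rate
```

A pipeline step could call `may_promote` after the test stage and stop the release when it returns `False`. Combining test results with a monitored error rate is one way to connect the monitoring data to the deployment decision.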

Create actionable alerts with actions

A critical aspect of monitoring is proactively notifying administrators of any current and predicted issues.

  • Create alerts in Azure Monitor based on logs and metrics to identify predictable failure states. Aim to make every alert actionable: each alert should represent an actual critical condition, and false positives should be kept to a minimum.
  • Define actions for alerts to use the most effective means of notifying your administrators. Available actions for notification are SMS, email, push notifications, and voice calls.
  • Remediate situations identified in alerts with Azure Automation runbooks, which can be launched from an alert using webhooks.
  • Use autoscaling to dynamically increase and decrease your compute resources based on collected metrics.
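
Two of the ideas in this list, reducing false positives and scaling on collected metrics, can be illustrated with plain Python. The sketch below is not how Azure Monitor evaluates alert rules or autoscale settings. The function names, the "three consecutive breaches" rule, and the CPU thresholds are assumptions picked for the example.

```python
from typing import Sequence


def should_alert(samples: Sequence[float], threshold: float,
                 min_consecutive: int = 3) -> bool:
    """Fire only when the metric exceeds the threshold for several samples in a row.

    Requiring consecutive breaches filters out one-off spikes, which is one
    simple way to reduce false positives.
    """
    run = 0
    for value in samples:
        run = run + 1 if value > threshold else 0
        if run >= min_consecutive:
            return True
    return False


def desired_instances(current: int, avg_cpu: float,
                      scale_out_at: float = 70.0, scale_in_at: float = 30.0,
                      minimum: int = 1, maximum: int = 10) -> int:
    """Pick the next instance count from average CPU, clamped to [minimum, maximum]."""
    if avg_cpu > scale_out_at:
        target = current + 1
    elif avg_cpu < scale_in_at:
        target = current - 1
    else:
        target = current
    return max(minimum, min(maximum, target))
```

With a threshold of 90, the samples `[50, 95, 60, 91, 92, 93]` trigger an alert because the last three values all breach it, while a single spike to 95 does not. The gap between the scale-out and scale-in thresholds keeps the instance count from flapping when the metric hovers near one value.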

Prepare dashboards and workbooks

Ensuring that your development and operations teams have access to the same telemetry and tools allows them to view patterns across your entire environment and minimize your Mean Time To Detect (MTTD) and Mean Time To Restore (MTTR).

  • Prepare custom dashboards based on common metrics and logs for the different roles in your organization. Dashboards can combine data from all Azure resources.
  • Prepare Workbooks to ensure knowledge sharing between development and operations. These could be prepared as dynamic reports with metric charts and log queries, or even as troubleshooting guides written by developers to help customer support or operations handle basic problems.
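
To make MTTD and MTTR concrete, the following sketch computes both from a list of incident timestamps. The `Incident` type and its field names are hypothetical. The sketch also assumes that both metrics are measured from the moment the incident started, whereas some teams measure time to restore from detection instead.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable, List


@dataclass
class Incident:
    """Timestamps for one incident (hypothetical shape)."""
    started: datetime
    detected: datetime
    restored: datetime


def _mean(deltas: Iterable[timedelta]) -> timedelta:
    """Average a non-empty collection of durations."""
    items = list(deltas)
    if not items:
        raise ValueError("at least one incident is required")
    return sum(items, timedelta()) / len(items)


def mttd(incidents: List[Incident]) -> timedelta:
    """Mean time from the start of an incident until it was detected."""
    return _mean(i.detected - i.started for i in incidents)


def mttr(incidents: List[Incident]) -> timedelta:
    """Mean time from the start of an incident until service was restored."""
    return _mean(i.restored - i.started for i in incidents)
```

Tracking these two values on a shared dashboard gives both teams the same view of whether detection and restoration are getting faster over time.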

Continuously optimize

Monitoring is one of the fundamental aspects of the popular Build-Measure-Learn philosophy, which recommends continuously tracking your KPIs and user behavior metrics and then striving to optimize them through planning iterations. Azure Monitor helps you collect metrics and logs relevant to your business and add new data points in the next deployment as required.

Next steps