Planned maintenance notifications for virtual machine scale sets

Azure periodically performs updates to improve the reliability, performance, and security of the host infrastructure for virtual machines (VMs). Updates might include patching the hosting environment or upgrading and decommissioning hardware. Most updates don't affect the hosted VMs. However, updates affect VMs in these scenarios:

  • If the maintenance doesn't require a reboot, Azure pauses the VM for a few seconds while the host is updated. These types of maintenance operations are applied fault domain by fault domain. Progress is stopped if any warning health signals are received.

  • If the maintenance requires a reboot, you get a notice of when the maintenance is planned. In these cases, you are given a time window, typically 35 days, in which you can start the maintenance yourself when it works for you.

Planned maintenance that requires a reboot is scheduled in waves. Each wave has a different scope (regions):

  • A wave starts with a notification to customers. By default, the notification is sent to the subscription owner and co-owners. You can add recipients and messaging options like email, SMS, and webhooks to the notifications by using Azure Activity Log alerts.
  • With the notification, a self-service window is made available. During this window, which is typically 35 days, you can find which of your VMs are included in the wave. You can proactively start maintenance according to your own scheduling needs.
  • After the self-service window, a scheduled maintenance window begins. At some point during this window, Azure schedules and applies the required maintenance to your VM.

The goal of having two windows is to give you enough time to start maintenance and reboot your VM, while knowing when Azure will automatically start maintenance.
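The relationship between the two windows can be sketched in a few lines of Python. This is only an illustration, not part of any Azure SDK: the function name `maintenance_phase`, the phase labels, and the sample dates are hypothetical. The four timestamps correspond to the window start and end times that Azure reports in the maintenance status.

```python
from datetime import datetime, timezone


def maintenance_phase(now, pre_start, pre_end, start, end):
    """Classify `now` against the self-service and scheduled windows."""
    if now < pre_start:
        return "not started"
    if now <= pre_end:
        return "self-service"      # you can start maintenance yourself
    if now < start:
        return "between windows"
    if now <= end:
        return "scheduled"         # Azure applies maintenance at some point
    return "finished"


# Hypothetical wave: a 35-day self-service window, then a 7-day scheduled window.
utc = timezone.utc
pre_start = datetime(2024, 1, 1, tzinfo=utc)
pre_end = datetime(2024, 2, 5, tzinfo=utc)
start = datetime(2024, 2, 5, tzinfo=utc)
end = datetime(2024, 2, 12, tzinfo=utc)

print(maintenance_phase(datetime(2024, 1, 15, tzinfo=utc), pre_start, pre_end, start, end))
```

With these sample dates, a check on January 15 falls inside the self-service window, so that is the period in which you could still choose your own maintenance time.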

You can use the Azure portal, PowerShell, the REST API, and the Azure CLI to query for maintenance windows for your virtual machine scale set VMs, and to start self-service maintenance.

Should you start maintenance during the self-service window?

The following guidelines can help you decide whether to start maintenance at a time that you choose.

Note

Self-service maintenance might not be available for all of your VMs. To determine whether proactive redeploy is available for your VM, look for Start now in the maintenance status. Currently, self-service maintenance isn't available for Azure Cloud Services (Web/Worker Role) and Azure Service Fabric.

Self-service maintenance isn't recommended for deployments that use availability sets. Availability sets are highly available setups in which only one update domain is affected at any time. For availability sets:

  • Let Azure trigger the maintenance. For maintenance that requires a reboot, maintenance is done update domain by update domain. Update domains don't necessarily receive the maintenance sequentially. There's a 30-minute pause between update domains.
  • If a temporary loss of some of your capacity (1/update domain count) is a concern, you can compensate for the loss by allocating additional instances during the maintenance period.
  • For maintenance that doesn't require a reboot, updates are applied at the fault domain level.
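The capacity figure of 1/update domain count can be turned into a quick estimate. The sketch below assumes, purely for illustration, that instances are spread as evenly as possible across update domains; the function name is hypothetical.

```python
import math


def instances_in_largest_update_domain(instance_count, update_domain_count):
    """Upper bound on instances rebooted together, assuming an even spread."""
    return math.ceil(instance_count / update_domain_count)


# 20 instances over 5 update domains: about 1/5 of capacity is affected at once.
print(instances_in_largest_update_domain(20, 5))
```

Under that assumption, the result is also a rough count of the additional instances to allocate if you want to keep full capacity while one update domain is being maintained.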

Don't use self-service maintenance in the following scenarios:

  • If you shut down your VMs frequently, either manually, by using DevTest Labs, by using auto-shutdown, or by following a schedule. Self-service maintenance in these scenarios might revert the maintenance status and cause additional downtime.
  • On short-lived VMs that you know will be deleted before the end of the maintenance wave.
  • For workloads with a large state stored in the local (ephemeral) disk that you want to maintain after the update.
  • If you resize your VM often. This scenario might revert the maintenance status.
  • If you have adopted scheduled events that enable proactive failover or graceful shutdown of your workload 15 minutes before maintenance shutdown begins.

Do use self-service maintenance if you plan to run your VM uninterrupted during the scheduled maintenance phase and none of the preceding contraindications apply.

It's best to use self-service maintenance in the following cases:

  • You need to communicate an exact maintenance window to management or your customer.
  • You need to complete the maintenance by a specific date.
  • You need to control the sequence of maintenance, for example, in a multi-tier application, to guarantee safe recovery.
  • You need more than 30 minutes of VM recovery time between two update domains. To control the time between update domains, you must trigger maintenance on your VMs one update domain at a time.
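Triggering maintenance one update domain at a time amounts to grouping instances by update domain and handling each group separately. A minimal sketch, assuming you already have (instance ID, update domain) pairs from your own inventory; the function name and the sample data are hypothetical:

```python
from collections import defaultdict


def batches_by_update_domain(instances):
    """Group (instance_id, update_domain) pairs into one batch per update domain."""
    groups = defaultdict(list)
    for instance_id, update_domain in instances:
        groups[update_domain].append(instance_id)
    # One batch per update domain, ordered by domain number.
    return [groups[domain] for domain in sorted(groups)]


# Hypothetical scale set with four instances across two update domains.
inventory = [("0", 0), ("1", 1), ("2", 0), ("3", 1)]
for batch in batches_by_update_domain(inventory):
    print(batch)
```

You would start maintenance on one batch, wait as long as your workload needs to recover, and only then move on to the next batch.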

View virtual machine scale sets that are affected by maintenance in the portal

When a planned maintenance wave is scheduled, you can view the list of virtual machine scale sets that are affected by the upcoming maintenance wave by using the Azure portal.

  1. Sign in to the Azure portal.

  2. In the left menu, select All services, and then select Virtual machine scale sets.

  3. Under Virtual machine scale sets, select Edit columns to open the list of available columns.

  4. In the Available columns section, select Self-service maintenance, and then move it to the Selected columns list. Select Apply.

    To make the Self-service maintenance item easier to find, you can change the drop-down option in the Available columns section from All to Properties.

The Self-service maintenance column now appears in the list of virtual machine scale sets. Each virtual machine scale set can have one of the following values for the Self-service maintenance column:

Value    Description
Yes      At least one VM in your virtual machine scale set is in a self-service window. You can start maintenance at any time during this self-service window.
No       No VMs are in a self-service window in the affected virtual machine scale set.
-        Your virtual machine scale sets aren't part of a planned maintenance wave.

Notification and alerts in the portal

Azure communicates a schedule for planned maintenance by sending an email to the subscription owner and co-owners group. You can add recipients and channels to this communication by creating Activity Log alerts. For more information, see Monitor subscription activity with the Azure Activity Log.

  1. Sign in to the Azure portal.
  2. In the left menu, select Monitor.
  3. In the Monitor - Alerts (classic) pane, select +Add activity log alert.
  4. On the Add activity log alert page, select or enter the requested information. In Criteria, make sure that you set the following values:
    • Event category: Select Service Health.
    • Services: Select Virtual Machine Scale Sets and Virtual Machines.
    • Type: Select Planned maintenance.

To learn more about how to configure Activity Log alerts, see Create Activity Log alerts.

Start maintenance on your virtual machine scale set from the portal

You can see more maintenance-related details in the overview of virtual machine scale sets. If at least one VM in the virtual machine scale set is included in the planned maintenance wave, a new notification ribbon is added near the top of the page. Select the notification ribbon to go to the Maintenance page.

On the Maintenance page, you can see which VM instances are affected by the planned maintenance. To start maintenance, select the check box that corresponds to the affected VM. Then, select Start maintenance.

After you start maintenance, the affected VMs in your virtual machine scale set undergo maintenance and are temporarily unavailable. If you missed the self-service window, you can still see the time window in which Azure will maintain your virtual machine scale set.

Check maintenance status by using PowerShell

You can use Azure PowerShell to see when VMs in your virtual machine scale sets are scheduled for maintenance. Planned maintenance information is available by using the Get-AzVmss cmdlet with the -InstanceView parameter.

Maintenance information is returned only if maintenance is planned. If no maintenance is scheduled that affects the VM instance, the cmdlet doesn't return any maintenance information.

Get-AzVmss -ResourceGroupName rgName -VMScaleSetName vmssName -InstanceId id -InstanceView

The following properties are returned under MaintenanceRedeployStatus:

Value                                  Description
IsCustomerInitiatedMaintenanceAllowed  Indicates whether you can start maintenance on the VM at this time.
PreMaintenanceWindowStartTime          The beginning of the maintenance self-service window when you can initiate maintenance on your VM.
PreMaintenanceWindowEndTime            The end of the maintenance self-service window when you can initiate maintenance on your VM.
MaintenanceWindowStartTime             The beginning of the scheduled maintenance window in which Azure initiates maintenance on your VM.
MaintenanceWindowEndTime               The end of the scheduled maintenance window in which Azure initiates maintenance on your VM.
LastOperationResultCode                The result of the last attempt to initiate maintenance on the VM.

Start maintenance on your VM instance by using PowerShell

You can start maintenance on a VM if IsCustomerInitiatedMaintenanceAllowed is set to true. Use the Set-AzVmss cmdlet with the -PerformMaintenance parameter.

Set-AzVmss -ResourceGroupName rgName -VMScaleSetName vmssName -InstanceId id -PerformMaintenance 

Check maintenance status by using the CLI

You can view planned maintenance information by using az vmss list-instances.

Maintenance information is returned only if maintenance is planned. If no maintenance that affects the VM instance is scheduled, the command doesn't return any maintenance information.

az vmss list-instances -g rgName -n vmssName --expand instanceView

The following properties are returned under MaintenanceRedeployStatus for each VM instance:

Value                                  Description
IsCustomerInitiatedMaintenanceAllowed  Indicates whether you can start maintenance on the VM at this time.
PreMaintenanceWindowStartTime          The beginning of the maintenance self-service window when you can initiate maintenance on your VM.
PreMaintenanceWindowEndTime            The end of the maintenance self-service window when you can initiate maintenance on your VM.
MaintenanceWindowStartTime             The beginning of the scheduled maintenance window in which Azure initiates maintenance on your VM.
MaintenanceWindowEndTime               The end of the scheduled maintenance window in which Azure initiates maintenance on your VM.
LastOperationResultCode                The result of the last attempt to initiate maintenance on the VM.
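If you save the JSON output of the list-instances command, a short script can pick out the instances that currently allow self-service maintenance. The sketch below is an illustration only. It assumes the JSON uses camelCase keys (`instanceId`, `instanceView`, `maintenanceRedeployStatus`, `isCustomerInitiatedMaintenanceAllowed`), which you should verify against your own output, and the sample data is invented.

```python
import json


def eligible_instance_ids(list_instances_json):
    """Return IDs of instances whose status allows customer-initiated maintenance."""
    eligible = []
    for instance in json.loads(list_instances_json):
        # The status block is absent when no maintenance is planned.
        view = instance.get("instanceView") or {}
        status = view.get("maintenanceRedeployStatus") or {}
        if status.get("isCustomerInitiatedMaintenanceAllowed") is True:
            eligible.append(instance["instanceId"])
    return eligible


# Invented sample: key names are assumptions, values are placeholders.
sample = """
[
  {"instanceId": "0", "instanceView": {"maintenanceRedeployStatus":
      {"isCustomerInitiatedMaintenanceAllowed": true}}},
  {"instanceId": "1", "instanceView": {"maintenanceRedeployStatus":
      {"isCustomerInitiatedMaintenanceAllowed": false}}},
  {"instanceId": "2", "instanceView": {}}
]
"""
print(eligible_instance_ids(sample))
```

Instances without a maintenance status block are treated as not eligible, which matches the behavior described above: no information is returned when no maintenance is planned.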

Start maintenance on your VM instance by using the CLI

The following call initiates maintenance on a VM instance if IsCustomerInitiatedMaintenanceAllowed is set to true:

az vmss perform-maintenance -g rgName -n vmssName --instance-ids id
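When you script this call, it can help to build the argument list in one place. The helper below is a hypothetical sketch that only assembles the arguments shown in the command above. It assumes that --instance-ids accepts several space-separated IDs, which you should confirm with az vmss perform-maintenance --help before relying on it.

```python
def perform_maintenance_command(resource_group, scale_set, instance_ids):
    """Build the argument list for the perform-maintenance call shown above."""
    if not instance_ids:
        raise ValueError("at least one instance ID is required")
    return [
        "az", "vmss", "perform-maintenance",
        "-g", resource_group,
        "-n", scale_set,
        "--instance-ids", *instance_ids,
    ]


# Placeholder names taken from the command above.
print(" ".join(perform_maintenance_command("rgName", "vmssName", ["0", "2"])))
```

The returned list can be passed to subprocess.run on a machine where the Azure CLI is installed and signed in. Building a list rather than a single string avoids shell quoting problems.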

FAQ

Q: Why do you need to reboot my VMs now?

A: Although most updates and upgrades to the Azure platform don't affect VM availability, in some cases, we can't avoid rebooting VMs hosted in Azure. We have accumulated several changes that require us to restart our servers, which results in VM reboots.

Q: If I follow your recommendations for high availability by using an availability set, am I safe?

A: Virtual machines deployed in an availability set or in virtual machine scale sets use update domains. When performing maintenance, Azure honors the update domain constraint and doesn't reboot VMs from a different update domain (within the same availability set). Azure also waits for at least 30 minutes before moving to the next group of VMs.

For more information about high availability, see Regions and availability for virtual machines in Azure.

Q: How can I be notified about planned maintenance?

A: A planned maintenance wave starts by setting a schedule to one or more Azure regions. Soon after, an email notification is sent to the subscription admins, co-admins, owners, and contributors (one email per subscription). You can configure additional channels and recipients for this notification by using Activity Log alerts. If you deploy a virtual machine to a region where planned maintenance is already scheduled, you don't receive the notification. Instead, check the maintenance state of the VM.

Q: I don't see any indication of planned maintenance in the portal, PowerShell, or the CLI. What's wrong?

A: Information related to planned maintenance is available during a planned maintenance wave only for the VMs that are affected by the planned maintenance. If you don't see data, the maintenance wave might already be finished (or not started), or your VM might already be hosted on an updated server.

Q: Is there a way to know exactly when my VM will be affected?

A: When we set the schedule, we define a time window of several days. The exact sequencing of servers (and VMs) within this window is unknown. If you want to know the exact time your VMs will be updated, you can use scheduled events. When you use scheduled events, you can query from within the VM and receive a 15-minute notification before a VM reboot.
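As an illustration of what a scheduled events query gives you, the sketch below filters an events document for reboots or redeploys that affect one instance. It makes no network call. The field names (`Events`, `EventId`, `EventType`, `Resources`, `NotBefore`), the date format, and the resource name `vmssName_0` are assumptions based on the general shape of the scheduled events document, and the sample data is invented; check them against the scheduled events documentation.

```python
import json
from email.utils import parsedate_to_datetime


def pending_reboots(document_json, resource_name):
    """Return (event_id, not_before) pairs for reboot-type events on one resource."""
    found = []
    for event in json.loads(document_json).get("Events", []):
        if event.get("EventType") not in ("Reboot", "Redeploy"):
            continue
        if resource_name not in event.get("Resources", []):
            continue
        not_before = event.get("NotBefore")
        # NotBefore can be empty once an event has already started.
        when = parsedate_to_datetime(not_before) if not_before else None
        found.append((event["EventId"], when))
    return found


# Invented sample document; field names and values are assumptions.
sample = """
{"DocumentIncarnation": 1, "Events": [
  {"EventId": "E1", "EventType": "Reboot", "ResourceType": "VirtualMachine",
   "Resources": ["vmssName_0"], "EventStatus": "Scheduled",
   "NotBefore": "Mon, 19 Sep 2016 18:29:47 GMT"},
  {"EventId": "E2", "EventType": "Freeze", "ResourceType": "VirtualMachine",
   "Resources": ["vmssName_0"], "EventStatus": "Scheduled",
   "NotBefore": "Mon, 19 Sep 2016 19:00:00 GMT"}
]}
"""
for event_id, when in pending_reboots(sample, "vmssName_0"):
    print(event_id, when)
```

A workload could poll such a document from within the VM and begin a graceful shutdown or failover once a matching event appears.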

Q: How long will it take you to reboot my VM?

A: Depending on the size of your VM, a reboot might take up to several minutes during the self-service maintenance window. During Azure-initiated reboots in the scheduled maintenance window, the reboot typically takes about 25 minutes. If you use Cloud Services (Web/Worker Role), virtual machine scale sets, or availability sets, you are given 30 minutes between each group of VMs (update domain) during the scheduled maintenance window.

Q: I don't see any maintenance information on my VMs. What went wrong?

A: There are several reasons why you might not see any maintenance information on your VMs:

  • You are using a subscription marked as Microsoft Internal.
  • Your VMs aren't scheduled for maintenance. It might be that the maintenance wave ended, was canceled, or was modified so that your VMs are no longer affected by it.
  • You don't have the Maintenance column added to your VM list view. Although we added this column to the default view, if you configure your view to see non-default columns, you must manually add the Maintenance column to your VM list view.

Q: My VM is scheduled for maintenance for the second time. Why?

A: In several cases, your VM is scheduled for maintenance again after you have already completed your maintenance and redeployed:

  • We have canceled the maintenance wave and restarted it with a different payload. It might be that we've detected a faulty payload and need to deploy an additional payload.
  • Your VM was service healed to another node due to a hardware fault.
  • You have selected to stop (deallocate) and restart the VM.
  • You have auto-shutdown turned on for the VM.

Next steps

Learn how to register for maintenance events from within the VM by using scheduled events.