Azure Metadata Service: Scheduled Events for Linux VMs
Scheduled Events is an Azure Metadata Service that gives your application time to prepare for virtual machine (VM) maintenance. It provides information about upcoming maintenance events (for example, a reboot) so that your application can prepare for them and limit disruption. It's available for all Azure Virtual Machines types, including PaaS and IaaS, on both Windows and Linux.
For information about Scheduled Events on Windows, see Scheduled Events for Windows VMs.
Note
Scheduled Events is generally available in all Azure regions. See Version and Region Availability for the latest release information.
Why use Scheduled Events?
Many applications can benefit from time to prepare for VM maintenance. The time can be used to perform application-specific tasks that improve availability, reliability, and serviceability, including:
- Checkpoint and restore.
- Connection draining.
- Primary replica failover.
- Removal from a load balancer pool.
- Event logging.
- Graceful shutdown.
With Scheduled Events, your application can discover when maintenance will occur and trigger tasks to limit its impact.
Scheduled Events provides events in the following use cases:
- Platform-initiated maintenance (for example, a VM reboot, live migration, or memory-preserving updates for the host).
- A virtual machine is running on degraded host hardware that is predicted to fail soon.
- User-initiated maintenance (for example, a user restarts or redeploys a VM).
The Basics
Metadata Service exposes information about running VMs by using a REST endpoint that's accessible from within the VM. The information is available via a nonroutable IP so that it's not exposed outside the VM.
Scope
Scheduled events are delivered to:
- Standalone Virtual Machines.
- All the VMs in a cloud service.
- All the VMs in an availability set.
- All the VMs in a scale set placement group.
As a result, check the `Resources` field in the event to identify which VMs are affected.
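Because a single event can list several VMs, a consumer usually filters the events by its own resource name. The following is a minimal sketch of that check. It assumes the event dictionaries follow the response shape documented in this article and that the name you pass in matches the entry in `Resources`; `events_affecting` is a hypothetical helper, not part of any Azure SDK, and the sample document is hand-written.

```python
def events_affecting(resource_name, document):
    """Return the events whose Resources list contains resource_name."""
    return [
        event
        for event in document.get("Events", [])
        if resource_name in event.get("Resources", [])
    ]


# Hand-written sample document; the IDs and VM names are illustrative only.
sample = {
    "DocumentIncarnation": 1,
    "Events": [
        {"EventId": "evt-1", "EventType": "Reboot", "Resources": ["vm-a", "vm-b"]},
        {"EventId": "evt-2", "EventType": "Freeze", "Resources": ["vm-c"]},
    ],
}

affected = events_affecting("vm-a", sample)
print([event["EventId"] for event in affected])
```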
Endpoint Discovery
For VNET-enabled VMs, Metadata Service is available from a static nonroutable IP, `169.254.169.254`. The full endpoint for the latest version of Scheduled Events is:
http://169.254.169.254/metadata/scheduledevents?api-version=2019-08-01
If the VM is not created within a Virtual Network, which is the default case for cloud services and classic VMs, additional logic is required to discover the IP address to use. To learn how to discover the host endpoint, see this sample.
Version and Region Availability
The Scheduled Events service is versioned. Versions are mandatory; the current version is `2019-08-01`.
Version | Release Type | Regions | Release Notes |
---|---|---|---|
2019-08-01 | General Availability | All | |
2019-04-01 | General Availability | All | |
2019-01-01 | General Availability | All | |
2017-08-01 | General Availability | All | |
2017-03-01 | Preview | All | |
Note
Previous preview releases of Scheduled Events supported {latest} as the api-version. This format is no longer supported and will be deprecated in the future.
Enabling and Disabling Scheduled Events
Scheduled Events is enabled for your service the first time you make a request for events. Expect the response to your first call to be delayed by up to two minutes.
Scheduled Events is disabled for your service if it does not make a request for 24 hours.
User-initiated Maintenance
User-initiated VM maintenance via the Azure portal, API, CLI, or PowerShell results in a scheduled event. You can use this to test the maintenance preparation logic in your application, and your application can prepare for user-initiated maintenance.
If you restart a VM, an event with the type `Reboot` is scheduled. If you redeploy a VM, an event with the type `Redeploy` is scheduled.
Use the API
Headers
When you query Metadata Service, you must provide the header `Metadata:true` to ensure the request wasn't unintentionally redirected. The `Metadata:true` header is required for all scheduled events requests. Failure to include the header in the request results in a "Bad Request" response from Metadata Service.
Query for events
You can query for scheduled events by making the following call:
Bash
curl -H Metadata:true "http://169.254.169.254/metadata/scheduledevents?api-version=2019-08-01"
A response contains an array of scheduled events. An empty array means that no events are currently scheduled. When events are scheduled, the response has the following form:
{
    "DocumentIncarnation": {IncarnationID},
    "Events": [
        {
            "EventId": {eventID},
            "EventType": "Reboot" | "Redeploy" | "Freeze" | "Terminate",
            "ResourceType": "VirtualMachine",
            "Resources": [{resourceName}],
            "EventStatus": "Scheduled" | "Started",
            "NotBefore": {timeInUTC},
            "Description": {eventDescription},
            "EventSource": "Platform" | "User"
        }
    ]
}
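To make the response easier to work with, it can be decoded into typed records. The sketch below is one possible approach, not part of the service: `ScheduledEvent` and `parse_events` are hypothetical names, and every value in the sample payload (including the `NotBefore` string, whose exact format is an assumption) is invented for illustration.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class ScheduledEvent:
    event_id: str
    event_type: str
    resource_type: str
    resources: List[str]
    event_status: str
    not_before: str
    description: str
    event_source: str


def parse_events(payload: str) -> List[ScheduledEvent]:
    """Decode a Scheduled Events JSON document into ScheduledEvent records."""
    document = json.loads(payload)
    return [
        ScheduledEvent(
            event_id=raw["EventId"],
            event_type=raw["EventType"],
            resource_type=raw["ResourceType"],
            resources=list(raw["Resources"]),
            event_status=raw["EventStatus"],
            not_before=raw.get("NotBefore", ""),
            description=raw.get("Description", ""),
            event_source=raw.get("EventSource", ""),
        )
        for raw in document.get("Events", [])
    ]


# Illustrative payload; none of these values come from a real VM.
SAMPLE = json.dumps({
    "DocumentIncarnation": 1,
    "Events": [{
        "EventId": "f020ba2e-3bc0-4c40-a10b-86575a9eabd5",
        "EventType": "Reboot",
        "ResourceType": "VirtualMachine",
        "Resources": ["vm-a"],
        "EventStatus": "Scheduled",
        "NotBefore": "Mon, 19 Sep 2016 18:29:47 GMT",
        "Description": "Illustrative description",
        "EventSource": "Platform",
    }],
})

events = parse_events(SAMPLE)
print(events[0].event_type, events[0].resources)
```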
Event Properties
Property | Description |
---|---|
EventId | Globally unique identifier for this event. |
EventType | Impact this event causes. Values: `Freeze`, `Reboot`, `Redeploy`, `Terminate`. |
ResourceType | Type of resource this event affects. Values: `VirtualMachine`. |
Resources | List of resources this event affects. The list is guaranteed to contain machines from at most one update domain (UD), but it might not contain all machines in the UD. |
EventStatus | Status of this event. Values: `Scheduled`, `Started`. No `Completed` or similar status is ever provided. The event is no longer returned when the event is finished. |
NotBefore | Time after which this event can start. |
Description | Description of this event. |
EventSource | Initiator of the event. Values: `Platform`, `User`. |
Event Scheduling
Each event is scheduled a minimum amount of time in the future based on the event type. This time is reflected in an event's `NotBefore` property.
EventType | Minimum notice |
---|---|
Freeze | 15 minutes |
Reboot | 15 minutes |
Redeploy | 10 minutes |
Terminate | User configurable: 5 to 15 minutes |
Note
In some cases, Azure is able to predict host failure due to degraded hardware and will attempt to mitigate disruption to your service by scheduling a migration. Affected virtual machines will receive a scheduled event with a `NotBefore` that is typically a few days in the future. The actual time varies depending on the predicted failure risk assessment. Azure tries to give 7 days' advance notice when possible, but the notice might be shorter if the prediction is that the hardware is highly likely to fail imminently. To minimize risk to your service in case the hardware fails before the system-initiated migration, we recommend that you self-redeploy your virtual machine as soon as possible.
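The time remaining before an event can start follows directly from its `NotBefore` value. The sketch below computes it. It assumes that `NotBefore` is an RFC 1123 style date string such as `Mon, 19 Sep 2016 18:29:47 GMT`; this article only states that the value is a UTC time, so verify the format against a real response before relying on it. `seconds_until` is a hypothetical helper.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def seconds_until(not_before, now=None):
    """Seconds from now until not_before; 0.0 if that time has already passed."""
    start = parsedate_to_datetime(not_before)
    if start.tzinfo is None:
        # Treat a value without a zone as UTC, as the property is documented in UTC.
        start = start.replace(tzinfo=timezone.utc)
    if now is None:
        now = datetime.now(timezone.utc)
    return max(0.0, (start - now).total_seconds())


# Fixed reference time so the example is reproducible: 15 minutes before the event.
reference = datetime(2016, 9, 19, 18, 14, 47, tzinfo=timezone.utc)
print(seconds_until("Mon, 19 Sep 2016 18:29:47 GMT", now=reference))
```

An application could compare this value with the time its own shutdown logic needs and decide whether to act immediately.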
Polling frequency
You can poll the endpoint for updates as frequently or infrequently as you like. However, the longer the time between requests, the more time you potentially lose to react to an upcoming event. Most events have 5 to 15 minutes of advance notice, although in some cases advance notice might be as little as 30 seconds. To ensure that you have as much time as possible to take mitigating actions, we recommend that you poll the service once per second.
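A polling loop built on this recommendation might look like the following sketch. It is deliberately decoupled from the network: `fetch` is any callable that returns the decoded response document, and the demonstration feeds it canned responses. The function name, the `max_polls` parameter, and the choice to report each `EventId` only once are assumptions made for illustration; an event whose status later changes from `Scheduled` to `Started` would not be reported again by this version.

```python
import time


def poll_for_new_events(fetch, on_event, interval_seconds=1.0,
                        max_polls=None, sleep=time.sleep):
    """Call fetch() repeatedly and pass each not-yet-seen event to on_event."""
    seen_ids = set()
    polls = 0
    while max_polls is None or polls < max_polls:
        document = fetch()
        current_ids = set()
        for event in document.get("Events", []):
            current_ids.add(event["EventId"])
            if event["EventId"] not in seen_ids:
                on_event(event)
        # Events that are no longer returned have finished, so forget them.
        seen_ids = current_ids
        polls += 1
        if max_polls is None or polls < max_polls:
            sleep(interval_seconds)


# Demonstration with canned responses instead of the real endpoint.
responses = iter([
    {"Events": []},
    {"Events": [{"EventId": "evt-1", "EventType": "Reboot"}]},
    {"Events": [{"EventId": "evt-1", "EventType": "Reboot"}]},
])
received = []
poll_for_new_events(lambda: next(responses), received.append,
                    max_polls=3, sleep=lambda _seconds: None)
print([event["EventId"] for event in received])
```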
Start an event
After you learn of an upcoming event and finish your logic for graceful shutdown, you can approve the outstanding event by making a `POST` call to Metadata Service with `EventId`. This call indicates to Azure that it can shorten the minimum notification time (when possible).
The following JSON sample is expected in the `POST` request body. The request should contain a list of `StartRequests`. Each `StartRequest` contains the `EventId` for the event you want to expedite:
{
    "StartRequests": [
        {
            "EventId": {EventId}
        }
    ]
}
Bash sample
curl -H Metadata:true -X POST -d '{"StartRequests": [{"EventId": "f020ba2e-3bc0-4c40-a10b-86575a9eabd5"}]}' "http://169.254.169.254/metadata/scheduledevents?api-version=2019-01-01"
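The same approval can be sketched in Python with `urllib.request`. The helper names `build_start_request` and `approve_events` are hypothetical, and the 30-second timeout is an arbitrary assumption. As with the `curl` call, no explicit content type is set, so the library's default is sent. Only the request construction runs here; `approve_events` works only from inside an Azure VM.

```python
import json
import urllib.request

ENDPOINT = "http://169.254.169.254/metadata/scheduledevents?api-version=2019-08-01"


def build_start_request(event_ids, url=ENDPOINT):
    """Create the POST request that approves the given events."""
    body = {"StartRequests": [{"EventId": event_id} for event_id in event_ids]}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Metadata": "true"},
        method="POST",
    )


def approve_events(event_ids, timeout=30):
    """Send the approval and return the HTTP status (Azure VMs only)."""
    request = build_start_request(event_ids)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return resp.status


# Inspect the request without touching the network.
request = build_start_request(["f020ba2e-3bc0-4c40-a10b-86575a9eabd5"])
print(request.get_method(), request.data.decode("utf-8"))
```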
Note
Acknowledging an event allows the event to proceed for all `Resources` in the event, not just the VM that acknowledges the event. Therefore, you can choose to elect a leader to coordinate the acknowledgement, which might be as simple as the first machine in the `Resources` field.
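The simplest leader rule mentioned in the note, "the first machine in the `Resources` field", can be sketched as follows. It assumes that every VM receives the `Resources` list in the same order, which this article does not state explicitly; `is_leader` is a hypothetical helper and the event is hand-written.

```python
def is_leader(event, this_resource):
    """True when this_resource is the first entry in the event's Resources list."""
    resources = event.get("Resources", [])
    return bool(resources) and resources[0] == this_resource


# Hand-written event; only the leader would go on to acknowledge it.
event = {"EventId": "evt-1", "Resources": ["vm-a", "vm-b"]}
print(is_leader(event, "vm-a"), is_leader(event, "vm-b"))
```

The leader would still need to confirm that the other listed VMs are ready before it acknowledges the event; that coordination is application-specific and is not shown.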
Python sample
The following sample queries Metadata Service for scheduled events and reports each outstanding event that affects the current host. Add your own handling logic, such as approving the event, at the marked comment:
#!/usr/bin/env python3
import json
import socket
import urllib.request

metadata_url = "http://169.254.169.254/metadata/scheduledevents?api-version=2019-08-01"
this_host = socket.gethostname()


def get_scheduled_events():
    # The Metadata:true header is required on every request.
    req = urllib.request.Request(metadata_url)
    req.add_header('Metadata', 'true')
    resp = urllib.request.urlopen(req)
    data = json.loads(resp.read())
    return data


def handle_scheduled_events(data):
    for evt in data['Events']:
        eventid = evt['EventId']
        status = evt['EventStatus']
        resources = evt['Resources']
        eventtype = evt['EventType']
        resourcetype = evt['ResourceType']
        notbefore = evt['NotBefore'].replace(" ", "_")
        description = evt['Description']
        eventSource = evt['EventSource']
        # Only report events that list this host among the affected resources.
        if this_host in resources:
            print("+ Scheduled Event. This host " + this_host +
                  " is scheduled for " + eventtype +
                  " by " + eventSource +
                  " with description " + description +
                  " not before " + notbefore)
            # Add logic for handling events here


def main():
    data = get_scheduled_events()
    handle_scheduled_events(data)


if __name__ == '__main__':
    main()
Next steps
- Review the Scheduled Events code samples in the Azure Instance Metadata Scheduled Events GitHub repository.
- Read more about the APIs that are available in the Instance Metadata Service.
- Learn about planned maintenance for Linux virtual machines in Azure.