Creating generalized images without a provisioning agent

Azure provides provisioning agents for Linux VMs in the form of walinuxagent or cloud-init (recommended). However, there may be scenarios where you don't want to use either of these applications as your provisioning agent, such as:

  • Your Linux distro or version does not support cloud-init or the Linux Agent.
  • You require specific VM properties to be set, such as the hostname.

Note

If you do not require any properties to be set or any form of provisioning to happen, you should consider creating a specialized image.

This article shows how you can set up your VM image to satisfy the Azure platform requirements and set the hostname, without installing a provisioning agent.

Networking and reporting ready

For your Linux VM to communicate with Azure components, it needs a DHCP client to retrieve a host IP from the virtual network, as well as DNS resolution and route management. Most distros ship with these utilities out of the box. Tools that have been tested on Azure by Linux distro vendors include dhclient, network-manager, systemd-networkd, and others.

Note

Currently, creating generalized images without a provisioning agent is supported only for DHCP-enabled VMs.

After networking has been set up and configured, you must "report ready". This tells Azure that the VM has been successfully provisioned.

Important

Failing to report ready to Azure will result in your VM being rebooted!
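Because reporting ready depends on working networking, a provisioning script may want to confirm that the WireServer address used later in this article (168.63.129.16) accepts TCP connections before it tries to report. The following is a minimal sketch, not part of the official flow; the function name `can_reach` and the default port and timeout are assumptions made for illustration.

```python
import socket

# WireServer address used by the report-ready script in this article.
WIRESERVER_IP = '168.63.129.16'


def can_reach(host, port=80, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened.

    Hypothetical helper: it only checks reachability and sends no data.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and missing routes.
        return False
```

On a VM, calling `can_reach(WIRESERVER_IP)` before the report-ready request gives an early, readable failure if DHCP or routing has not come up yet.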

Demo/sample

This demo shows how to take an existing Marketplace image (in this case, a Debian Buster VM), remove the Linux Agent (walinuxagent), and create the most basic process to report to Azure that the VM is "ready".

Create the resource group:

$ az group create --location chinanorth --name demo1

Create the base VM:

$ az vm create \
    --resource-group demo1 \
    --name demo1 \
    --location chinanorth \
    --ssh-key-value <ssh_pub_key_path> \
    --public-ip-address-dns-name demo1 \
    --image "debian:debian-10:10:latest"

Remove the image provisioning agent

Once the VM is provisioned, you can SSH into it and remove the Linux Agent:

$ sudo apt purge -y waagent
$ sudo rm -rf /var/lib/waagent /etc/waagent.conf /var/log/waagent.log

Add required code to the VM

Still inside the VM, because we've removed the Azure Linux Agent, we need to provide a mechanism to report ready.

Python script

Save the following script on the VM as /usr/local/azure-provisioning.py, which is the path that the systemd unit later in this article runs:

import http.client
import sys
from xml.etree import ElementTree

wireserver_ip = '168.63.129.16'
wireserver_conn = http.client.HTTPConnection(wireserver_ip)

print('Retrieving goal state from the Wireserver')
wireserver_conn.request(
    'GET',
    '/machine?comp=goalstate',
    headers={'x-ms-version': '2012-11-30'}
)

resp = wireserver_conn.getresponse()

if resp.status != 200:
    print('Unable to connect with wireserver')
    sys.exit(1)

wireserver_goalstate = resp.read().decode('utf-8')

xml_el = ElementTree.fromstring(wireserver_goalstate)

container_id = xml_el.findtext('Container/ContainerId')
instance_id = xml_el.findtext('Container/RoleInstanceList/RoleInstance/InstanceId')
print(f'ContainerId: {container_id}')
print(f'InstanceId: {instance_id}')

# Construct the XML response we need to send to Wireserver to report ready.
health = ElementTree.Element('Health')
goalstate_incarnation = ElementTree.SubElement(health, 'GoalStateIncarnation')
goalstate_incarnation.text = '1'
container = ElementTree.SubElement(health, 'Container')
container_id_el = ElementTree.SubElement(container, 'ContainerId')
container_id_el.text = container_id
role_instance_list = ElementTree.SubElement(container, 'RoleInstanceList')
role = ElementTree.SubElement(role_instance_list, 'Role')
instance_id_el = ElementTree.SubElement(role, 'InstanceId')
instance_id_el.text = instance_id
health_second = ElementTree.SubElement(role, 'Health')
state = ElementTree.SubElement(health_second, 'State')
state.text = 'Ready'

out_xml = ElementTree.tostring(
    health,
    encoding='unicode',
    method='xml'
)
print('Sending the following data to Wireserver:')
print(out_xml)

wireserver_conn.request(
    'POST',
    '/machine?comp=health',
    headers={
        'x-ms-version': '2012-11-30',
        'Content-Type': 'text/xml;charset=utf-8',
        'x-ms-agent-name': 'custom-provisioning'
    },
    body=out_xml
)

resp = wireserver_conn.getresponse()
print(f'Response: {resp.status} {resp.reason}')

wireserver_conn.close()

Generic steps (without using Python)

If your VM doesn't have Python installed or available, you can reproduce the logic of the script above with the following steps:

  1. Retrieve the ContainerId and InstanceId by parsing the response from the WireServer:

    curl -X GET -H 'x-ms-version: 2012-11-30' http://168.63.129.16/machine?comp=goalstate

  2. Construct the following XML data, injecting the ContainerId and InstanceId parsed in the previous step:

    <Health>
      <GoalStateIncarnation>1</GoalStateIncarnation>
      <Container>
        <ContainerId>CONTAINER_ID</ContainerId>
        <RoleInstanceList>
          <Role>
            <InstanceId>INSTANCE_ID</InstanceId>
            <Health>
              <State>Ready</State>
            </Health>
          </Role>
        </RoleInstanceList>
      </Container>
    </Health>
    
  3. Post this data to the WireServer (here $REPORT_READY_XML is assumed to hold the XML constructed in step 2):

    curl -X POST -H 'x-ms-version: 2012-11-30' -H "x-ms-agent-name: WALinuxAgent" -H "Content-Type: text/xml;charset=utf-8" -d "$REPORT_READY_XML" http://168.63.129.16/machine?comp=health
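The parsing and XML construction in the steps above can be sketched as two small functions that do no network I/O, which makes them easy to check offline. This is an illustrative sketch only: the function names and `SAMPLE_GOAL_STATE` are hypothetical, and the sample is a heavily trimmed document that contains only the two element paths the script in this article reads. A real goal state response contains more fields.

```python
from xml.etree import ElementTree

# Hypothetical, trimmed goal state containing only the elements read below.
SAMPLE_GOAL_STATE = (
    '<GoalState><Container>'
    '<ContainerId>7b324f53-983a-43bc-b919-1775d6077608</ContainerId>'
    '<RoleInstanceList><RoleInstance>'
    '<InstanceId>fbb84507-46cd-4f4e-bd78-a2edaa9d059b._thstringnopa2</InstanceId>'
    '</RoleInstance></RoleInstanceList>'
    '</Container></GoalState>'
)


def parse_goal_state(goal_state_xml):
    """Step 1: extract (ContainerId, InstanceId) from the goal state XML."""
    root = ElementTree.fromstring(goal_state_xml)
    container_id = root.findtext('Container/ContainerId')
    instance_id = root.findtext(
        'Container/RoleInstanceList/RoleInstance/InstanceId')
    if not container_id or not instance_id:
        raise ValueError('goal state is missing ContainerId or InstanceId')
    return container_id, instance_id


def build_health_report(container_id, instance_id, incarnation='1'):
    """Step 2: build the Health XML document that reports the VM as Ready."""
    health = ElementTree.Element('Health')
    ElementTree.SubElement(health, 'GoalStateIncarnation').text = incarnation
    container = ElementTree.SubElement(health, 'Container')
    ElementTree.SubElement(container, 'ContainerId').text = container_id
    role_list = ElementTree.SubElement(container, 'RoleInstanceList')
    role = ElementTree.SubElement(role_list, 'Role')
    ElementTree.SubElement(role, 'InstanceId').text = instance_id
    role_health = ElementTree.SubElement(role, 'Health')
    ElementTree.SubElement(role_health, 'State').text = 'Ready'
    return ElementTree.tostring(health, encoding='unicode', method='xml')
```

Calling `build_health_report(*parse_goal_state(SAMPLE_GOAL_STATE))` yields the body for step 3. Failing loudly when an ID is missing avoids posting a malformed report.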

Automating running the code at first boot

This demo uses systemd, which is the most common init system in modern Linux distros. The easiest and most native way to ensure this report-ready mechanism runs at the right time is therefore to create a systemd service unit. You can add the following unit file to /etc/systemd/system (this example names the unit file azure-provisioning.service):

[Unit]
Description=Azure Provisioning

[Service]
Type=oneshot
ExecStart=/usr/bin/python3 /usr/local/azure-provisioning.py
ExecStart=/bin/bash -c "hostnamectl set-hostname $(curl \
    -H 'metadata: true' \
    'http://169.254.169.254/metadata/instance/compute/name?api-version=2019-06-01&format=text')"
ExecStart=/usr/bin/systemctl disable azure-provisioning.service

[Install]
WantedBy=multi-user.target

This systemd service does three things for basic provisioning:

  1. Reports ready to Azure (to indicate that the VM came up successfully).
  2. Renames the VM to the user-supplied VM name, which it pulls from IMDS.
  3. Disables itself so that it runs only on first boot and not on subsequent reboots.
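If you would rather keep the hostname lookup in Python than in the inline curl call, the IMDS request from the unit file can be sketched as follows. This is a minimal sketch under stated assumptions: the function names and the timeout are hypothetical, and it only retrieves the name; applying it (for example with hostnamectl, as the unit does) is left to the caller.

```python
import urllib.request

# Same IMDS endpoint and API version that the unit file queries with curl.
IMDS_NAME_URL = (
    'http://169.254.169.254/metadata/instance/compute/name'
    '?api-version=2019-06-01&format=text'
)


def build_imds_name_request():
    """Build the GET request for the VM name, with the Metadata header set."""
    return urllib.request.Request(IMDS_NAME_URL, headers={'Metadata': 'true'})


def fetch_vm_name(timeout=5):
    """Return the VM name reported by IMDS as a stripped string.

    An empty ProxyHandler is used so that proxy settings from the
    environment are ignored for this link-local request.
    """
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(build_imds_name_request(), timeout=timeout) as resp:
        return resp.read().decode('utf-8').strip()
```

`fetch_vm_name()` only works from inside an Azure VM; elsewhere the request times out or fails with a URL error.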

With the unit file on the filesystem, run the following to enable it:

$ sudo systemctl enable azure-provisioning.service

The VM is now ready to be generalized and to have an image created from it.

Completing the preparation of the image

Back on your development machine, run the following to prepare for image creation from the base VM:

$ az vm deallocate --resource-group demo1 --name demo1
$ az vm generalize --resource-group demo1 --name demo1

Then create the image from this VM:

$ az image create \
    --resource-group demo1 \
    --source demo1 \
    --location chinanorth \
    --name demo1img

Now we are ready to create a new VM (or multiple VMs) from the image:

$ IMAGE_ID=$(az image show -g demo1 -n demo1img --query id -o tsv)
$ az vm create \
    --resource-group demo12 \
    --name demo12 \
    --location chinanorth \
    --ssh-key-value <ssh_pub_key_path> \
    --public-ip-address-dns-name demo12 \
    --image "$IMAGE_ID" \
    --enable-agent false

Note

It is important to set --enable-agent to false because walinuxagent doesn't exist on the VM that is going to be created from the image.

This VM should provision successfully. After logging in to the newly provisioned VM, you should be able to see the output of the report-ready systemd service:

$ sudo journalctl -u azure-provisioning.service
-- Logs begin at Thu 2020-06-11 20:28:45 UTC, end at Thu 2020-06-11 20:31:24 UTC. --
Jun 11 20:28:49 thstringnopa systemd[1]: Starting Azure Provisioning...
Jun 11 20:28:54 thstringnopa python3[320]: Retrieving goal state from the Wireserver
Jun 11 20:28:54 thstringnopa python3[320]: ContainerId: 7b324f53-983a-43bc-b919-1775d6077608
Jun 11 20:28:54 thstringnopa python3[320]: InstanceId: fbb84507-46cd-4f4e-bd78-a2edaa9d059b._thstringnopa2
Jun 11 20:28:54 thstringnopa python3[320]: Sending the following data to Wireserver:
Jun 11 20:28:54 thstringnopa python3[320]: <Health><GoalStateIncarnation>1</GoalStateIncarnation><Container><ContainerId>7b324f53-983a-43bc-b919-1775d6077608</ContainerId><RoleInstanceList><Role><InstanceId>fbb84507-46cd-4f4e-bd78-a2edaa9d059b._thstringnopa2</InstanceId><Health><State>Ready</State></Health></Role></RoleInstanceList></Container></Health>
Jun 11 20:28:54 thstringnopa python3[320]: Response: 200 OK
Jun 11 20:28:56 thstringnopa bash[472]:   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
Jun 11 20:28:56 thstringnopa bash[472]:                                  Dload  Upload   Total   Spent    Left  Speed
Jun 11 20:28:56 thstringnopa bash[472]: [158B blob data]
Jun 11 20:28:56 thstringnopa2 systemctl[475]: Removed /etc/systemd/system/multi-user.target.wants/azure-provisioning.service.
Jun 11 20:28:56 thstringnopa2 systemd[1]: azure-provisioning.service: Succeeded.
Jun 11 20:28:56 thstringnopa2 systemd[1]: Started Azure Provisioning.

Support

If you implement your own provisioning code or agent, you own the support of that code; Microsoft support will only investigate issues relating to the provisioning interfaces not being available. We are continually making improvements and changes in this area, so you must monitor cloud-init and the Azure Linux Agent for provisioning API changes.

Next steps

For more information, see Linux provisioning.