Time sync for Linux VMs in Azure

Time sync is important for security and event correlation. It is sometimes used to implement distributed transactions. Time accuracy between multiple computer systems is achieved through synchronization. Synchronization can be affected by several factors, including reboots and network traffic between the time source and the computer fetching the time.

Azure is backed by infrastructure running Windows Server 2016. Windows Server 2016 has improved algorithms for correcting time and conditioning the local clock to synchronize with UTC. The Windows Server 2016 Accurate Time feature greatly improved how the VMICTimeSync service governs VM synchronization with the host for accurate time. Improvements include a more accurate initial time on VM start or VM restore, and interrupt latency correction.

Note

For more information, see Accurate time for Windows Server 2016.

Overview

The accuracy of a computer clock is gauged by how close the clock is to the Coordinated Universal Time (UTC) time standard. UTC is defined by a multinational sample of precise atomic clocks that can only be off by one second in 300 years. However, reading UTC directly requires specialized hardware. Instead, time servers are synced to UTC and are accessed from other computers to provide scalability and robustness. Every computer runs a time synchronization service that knows which time servers to use, periodically checks whether the computer clock needs to be corrected, and adjusts the time if needed.

Azure hosts are synchronized to internal Azure time servers that take their time from Azure-owned Stratum 1 devices with GPS antennas. Virtual machines in Azure can depend on their host to pass the accurate time (host time) on to the VM, get time directly from a time server, or use a combination of both.

On stand-alone hardware, the Linux OS only reads the host hardware clock on boot. After that, the clock is maintained using the interrupt timer in the Linux kernel. In this configuration, the clock will drift over time. In newer Linux distributions on Azure, VMs can use the VMICTimeSync provider, included in the Linux Integration Services (LIS), to query for clock updates from the host more frequently.

Virtual machine interactions with the host can also affect the clock. During memory preserving maintenance, VMs are paused for up to 30 seconds. For example, suppose the VM clock shows 10:00:00 AM when maintenance begins and the pause lasts 28 seconds. After the VM resumes, the clock on the VM would still show 10:00:00 AM, which would be 28 seconds off. To correct for this, the VMICTimeSync service monitors what is happening on the host and prompts for changes on the VMs to compensate.

Without working time synchronization, the clock on the VM would accumulate errors. When there is only one VM, the effect might not be significant unless the workload requires highly accurate timekeeping. But in most cases, there are multiple interconnected VMs that use time to track transactions, and the time needs to be consistent throughout the entire deployment. When time differs between VMs, you could see the following effects:

  • Authentication will fail. Security protocols like Kerberos and certificate-dependent technologies rely on time being consistent across systems.
  • It's very hard to figure out what has happened in a system if logs (or other data) don't agree on time. The same event would look like it occurred at different times, making correlation difficult.
  • If the clock is off, billing could be calculated incorrectly.

Configuration options

There are generally three ways to configure time sync for your Linux VMs hosted in Azure:

  • Use the default configuration for Azure Marketplace images, which uses both NTP time and VMICTimeSync host time.
  • Use host-only time through VMICTimeSync.
  • Use another, external time server, with or without VMICTimeSync host time.

Use the default

By default, most Azure Marketplace images for Linux are configured to sync from two sources:

  • NTP as primary, which gets time from an NTP server. For example, Ubuntu 16.04 LTS Marketplace images use ntp.ubuntu.com.

  • The VMICTimeSync service as secondary, used to communicate the host time to the VMs and make corrections after the VM is paused for maintenance. Azure hosts use Azure-owned Stratum 1 devices to keep accurate time.

In newer Linux distributions, the VMICTimeSync service provides a Precision Time Protocol (PTP) hardware clock source. Earlier distributions may not provide this clock source and will fall back to NTP for getting time from the host.

To confirm NTP is synchronizing correctly, run the ntpq -p command.
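As a small sketch of how that check could be scripted: in ntpq -p output, the peer currently selected for synchronization is flagged with an asterisk in the first column. The helper name has_sys_peer below is hypothetical. It reads ntpq -p output on standard input and succeeds only if such a line is present.

```shell
#!/bin/sh
# has_sys_peer (hypothetical helper): reads `ntpq -p` output on stdin and
# returns 0 when some line starts with '*', the tally code ntpq uses for
# the peer the local clock is currently synchronized to.
has_sys_peer() {
    grep -q '^\*'
}
```

A possible use is `ntpq -p | has_sys_peer && echo "NTP has a selected peer"`. No asterisk shortly after boot does not necessarily indicate a fault, because ntpd needs several polling intervals before it selects a peer.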

Host-only

Because NTP servers like time.windows.com and ntp.ubuntu.com are public, syncing time with them requires sending traffic over the internet. Varying packet delays can negatively affect the quality of the time sync. Removing NTP by switching to host-only sync can sometimes improve your time sync results.

Switching to host-only time sync makes sense if you experience time sync issues with the default configuration. Try host-only sync to see whether it improves the time sync on your VM.
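One possible way to try host-only sync, assuming chrony is the time daemon and its configuration already contains the refclock PHC line described in the chrony section, is to comment out the NTP server and pool directives so that only the host clock remains as a source. The helper name to_host_only is hypothetical. It filters a configuration from standard input to standard output and does not touch any system file.

```shell
#!/bin/sh
# to_host_only (hypothetical helper): comments out every "server" and "pool"
# directive in a chrony configuration read from stdin. All other lines,
# including refclock and makestep, pass through unchanged.
to_host_only() {
    sed -E 's/^[[:space:]]*(server|pool)[[:space:]]/# &/'
}
```

For example, `to_host_only < /etc/chrony.conf` prints a candidate configuration that you can review before replacing the real file and restarting chronyd. The configuration path may differ between distributions.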

External time server

If you have specific time sync requirements, you can also use external time servers. External time servers can provide a specific time, which can be useful for test scenarios, for ensuring time uniformity with machines hosted in non-Microsoft datacenters, or for handling leap seconds in a special way.

You can combine an external time server with the VMICTimeSync service to provide results similar to the default configuration. Combining an external time server with VMICTimeSync is the best option for dealing with issues that can be caused when VMs are paused for maintenance.

Tools and resources

There are some basic commands for checking your time synchronization configuration. The documentation for your Linux distribution will have more details on the best way to configure time synchronization for that distribution.

Integration services

Check to see if the integration service (hv_utils) is loaded.

lsmod | grep hv_utils

You should see something similar to this:

hv_utils               24418  0
hv_vmbus              397185  7 hv_balloon,hyperv_keyboard,hv_netvsc,hid_hyperv,hv_utils,hyperv_fb,hv_storvsc

See if the Hyper-V integration services daemon is running.

ps -ef | grep hv

You should see something similar to this:

root        229      2  0 17:52 ?        00:00:00 [hv_vmbus_con]
root        391      2  0 17:52 ?        00:00:00 [hv_balloon]

Check for PTP clock source

With newer versions of Linux, a Precision Time Protocol (PTP) clock source is available as part of the VMICTimeSync provider. On older versions of CentOS 7.x, the Linux Integration Services can be downloaded and used to install the updated driver. When the PTP clock source is available, the Linux device will be of the form /dev/ptpx.

See which PTP clock sources are available.

ls /sys/class/ptp

In this example, the value returned is ptp0. To verify the device, check its clock name.

cat /sys/class/ptp/ptp0/clock_name

This should return hyperv.
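On a VM with more than one PTP device, the two checks above can be combined in a loop. The following is a minimal sketch. The function name find_hyperv_ptp and its optional directory argument are hypothetical, added so the lookup can be pointed at a test directory instead of /sys/class/ptp.

```shell
#!/bin/sh
# find_hyperv_ptp (hypothetical helper): prints the /dev path of the first
# PTP device whose clock_name is "hyperv" and returns 0, or returns 1 if
# none is found. The optional argument overrides the sysfs directory.
find_hyperv_ptp() {
    base="${1:-/sys/class/ptp}"
    for dev in "$base"/ptp*; do
        # Skip entries without a readable clock_name (also covers an
        # unmatched glob, which stays as the literal pattern).
        [ -r "$dev/clock_name" ] || continue
        if [ "$(cat "$dev/clock_name")" = "hyperv" ]; then
            echo "/dev/$(basename "$dev")"
            return 0
        fi
    done
    return 1
}
```

Calling find_hyperv_ptp with no argument on an Azure VM would print a path such as /dev/ptp0 when the Hyper-V clock source is present. That path is the one to use in the chrony refclock line.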

chrony

On Ubuntu 19.10 and later versions, and on CentOS 8.x, chrony is configured to use a PTP source clock. Older Linux releases use the Network Time Protocol daemon (ntpd) instead of chrony, and ntpd doesn't support PTP sources. To enable PTP in those releases, chrony must be manually installed and configured by adding the following line to chrony.conf:

refclock PHC /dev/ptp0 poll 3 dpoll -2 offset 0

For more information about Ubuntu and NTP, see Time Synchronization.

For more information about chrony, see Using chrony.

If both chrony and VMICTimeSync sources are enabled simultaneously, you can mark one as prefer, which sets the other source as a backup. Because NTP services correct large clock skews only after a long period, VMICTimeSync will recover the clock after a VM pause far more quickly than NTP-based tools alone.
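As a minimal sketch of marking a preferred source, the chrony.conf fragment below combines an external NTP server with the host clock and adds chrony's prefer option to the refclock line. The server name time.example.com is a hypothetical placeholder, and /dev/ptp0 assumes the Hyper-V clock source was found at that device.

```
# Hypothetical external NTP server; replace with your own time source.
server time.example.com iburst

# Host time through the VMICTimeSync PTP device, marked as the preferred source.
refclock PHC /dev/ptp0 poll 3 dpoll -2 offset 0 prefer
```

Moving prefer to the server line instead would make the external server the favored source and leave the host clock as the backup.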

By default, chronyd speeds up or slows down the system clock to fix any time drift. If the drift becomes too large, chrony fails to fix it. To overcome this, change the makestep parameter in /etc/chrony.conf to force a time sync if the drift exceeds the specified threshold.

makestep 1.0 -1

Here, chrony will force a time update if the drift is greater than 1 second. To apply the changes, restart the chronyd service:

systemctl restart chronyd
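The makestep rule can be modeled in a few lines. In chrony, the first makestep argument is the threshold in seconds. The second is a limit on how many clock updates after startup may still step the clock, and a negative value disables that limit. The function would_step below is a hypothetical, simplified model of that decision for illustration only. It is not chrony's implementation.

```shell
#!/bin/sh
# would_step OFFSET THRESHOLD LIMIT UPDATES (hypothetical, simplified model)
# Returns 0 when a "makestep THRESHOLD LIMIT" rule would step the clock:
# |OFFSET| must exceed THRESHOLD seconds, and either LIMIT is negative
# (no limit) or UPDATES, the clock updates so far, has not passed LIMIT.
would_step() {
    awk -v off="$1" -v thr="$2" -v lim="$3" -v n="$4" 'BEGIN {
        if (off < 0) off = -off
        exit !((off > thr) && (lim < 0 || n <= lim))
    }'
}
```

With makestep 1.0 -1, a 28-second offset after a VM pause qualifies for a step no matter how long chronyd has been running (`would_step 28 1.0 -1 500` succeeds). With a positive limit such as 3, the same offset would be slewed instead once more than three updates have occurred.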

systemd

On SUSE and Ubuntu releases before 19.10, time sync is configured using systemd. For more information about Ubuntu, see Time Synchronization. For more information about SUSE, see Section 4.5.8 in SUSE Linux Enterprise Server 12 SP3 Release Notes.

Next steps

For more information, see Accurate time for Windows Server 2016.