Plan your Azure Time Series Insights Gen1 environment

This article describes how to plan your Azure Time Series Insights Gen1 environment based on your expected ingress rate and your data retention requirements.

Best practices

To get started with Azure Time Series Insights, it's best to know how much data you expect to push per minute and how long you need to store your data.

For more information about capacity and retention for both Azure Time Series Insights SKUs, read Azure Time Series Insights pricing.

To best plan your Azure Time Series Insights environment for long-term success, consider the following attributes:

Storage capacity

By default, Azure Time Series Insights retains data based on the amount of storage you provision (units × the amount of storage per unit) and ingress.
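As a rough illustration of how provisioned storage and ingress interact, the following Python sketch multiplies units by per-unit storage and estimates how many days of data fit at a steady ingress rate. The per-unit figures come from the SKU capacity tables in this article; the function names and the steady-ingress assumption are hypothetical and aren't part of the service.

```python
# Maximum storage per unit in GB, taken from the SKU capacity tables in this article.
STORAGE_GB_PER_UNIT = {"S1": 30, "S2": 300}


def provisioned_storage_gb(sku: str, units: int) -> int:
    """Provisioned storage = units x storage per unit."""
    return STORAGE_GB_PER_UNIT[sku] * units


def days_until_storage_full(sku: str, units: int, daily_ingress_gb: float) -> float:
    """Estimate how many days of data fit, assuming a constant daily ingress.

    This is a planning estimate only (assumption: ingress is steady).
    """
    if daily_ingress_gb <= 0:
        raise ValueError("daily_ingress_gb must be positive")
    return provisioned_storage_gb(sku, units) / daily_ingress_gb


if __name__ == "__main__":
    print(provisioned_storage_gb("S1", 2))
    print(days_until_storage_full("S2", 1, 10))
```

An estimate like this can help you decide whether the retention period you want is realistic for the storage you plan to provision.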

Data retention

You can change the Data retention time setting in your Azure Time Series Insights environment. You can enable up to 400 days of retention.

Azure Time Series Insights has two modes:

  • One mode optimizes for the most up-to-date data. It enforces a policy to Purge old data, leaving recent data available with the instance. This mode is on by default.
  • The other mode keeps data within the configured retention limits. When Pause ingress is selected as the Storage limit exceeded behavior, new data is prevented from being ingressed.

You can adjust retention and toggle between the two modes on the environment's configuration page in the Azure portal.

Important

You can configure a maximum of 400 days of data retention in your Azure Time Series Insights Gen1 environment.

Configure data retention

  1. In the Azure portal, select your Time Series Insights environment.

  2. In the Time Series Insights environment pane, under Settings, select Storage configuration.

  3. In the Data retention time (in days) box, enter a value between 1 and 400.

    Configure retention

Tip

To learn more about how to implement an appropriate data retention policy, read How to configure retention.

Ingress capacity

The following summarizes key limits in General Availability.

SKU ingress rates and capacities

S1 and S2 SKU ingress rates and capacities provide flexibility when you configure a new Time Series Insights environment. Your SKU capacity indicates your daily ingress rate based on the number of events or bytes stored, whichever comes first. Note that ingress is measured per minute, and throttling is applied by using the token bucket algorithm. Ingress is measured in 1-KB blocks. For example, a 0.8-KB event is counted as one event, and a 2.6-KB event is counted as three events.
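The 1-KB block rule amounts to rounding each event's size up to the next whole kilobyte. The following Python sketch illustrates that arithmetic only; the function names are hypothetical and this isn't the service's own metering code.

```python
import math


def metered_events(event_size_kb: float) -> int:
    """Count one metered event per started 1-KB block."""
    if event_size_kb <= 0:
        raise ValueError("event_size_kb must be positive")
    return math.ceil(event_size_kb)


def total_metered_events(event_sizes_kb: list[float]) -> int:
    """Sum the metered events for a batch of event sizes given in KB."""
    return sum(metered_events(size) for size in event_sizes_kb)


if __name__ == "__main__":
    # The two examples from the text: 0.8 KB and 2.6 KB.
    print(metered_events(0.8), metered_events(2.6))
```

Because every started block counts, many small events slightly over 1 KB consume capacity twice as fast as events that fit within a single block.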

S1 SKU capacity | Ingress rate | Maximum storage capacity
1 | 1 GB (1 million events) per day | 30 GB (30 million events) per month
10 | 10 GB (10 million events) per day | 300 GB (300 million events) per month

S2 SKU capacity | Ingress rate | Maximum storage capacity
1 | 10 GB (10 million events) per day | 300 GB (300 million events) per month
10 | 100 GB (100 million events) per day | 3 TB (3 billion events) per month

Note

Capacities scale linearly, so an S1 SKU with capacity 2 supports an ingress rate of 2 GB (2 million events) per day and 60 GB (60 million events) per month.
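Linear scaling can be sketched as a simple multiplication of the per-unit figures by the number of units. In the following Python sketch, the per-unit values are the capacity-1 rows of the S1 and S2 tables in this article; the dictionary layout and the function name are hypothetical.

```python
# Per-unit (capacity 1) figures from the S1 and S2 tables in this article.
PER_UNIT = {
    "S1": {"daily_gb": 1, "daily_events": 1_000_000,
           "monthly_gb": 30, "monthly_events": 30_000_000},
    "S2": {"daily_gb": 10, "daily_events": 10_000_000,
           "monthly_gb": 300, "monthly_events": 300_000_000},
}

MAX_UNITS = 10  # an environment can be scaled to at most 10 units


def sku_capacity(sku: str, units: int) -> dict[str, int]:
    """Scale the per-unit limits linearly by the number of units."""
    if not 1 <= units <= MAX_UNITS:
        raise ValueError(f"units must be between 1 and {MAX_UNITS}")
    return {name: value * units for name, value in PER_UNIT[sku].items()}


if __name__ == "__main__":
    print(sku_capacity("S1", 2))
```

For S1 with two units, this yields 2 GB (2 million events) per day and 60 GB (60 million events) per month, which matches the note above.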

S2 SKU environments support substantially more events per month and have a significantly higher ingress capacity.

SKU | Event count per month | Event count per minute | Event size per minute
S1 | 30 million | 720 | 720 KB
S2 | 300 million | 7,200 | 7,200 KB

Property limits

GA property limits depend on the SKU environment that's selected. Supplied event properties have corresponding JSON, CSV, and chart columns that can be viewed within the Time Series Insights Explorer.

SKU | Maximum properties
S1 | 600 properties (columns)
S2 | 800 properties (columns)

Event sources

A maximum of two event sources per instance is supported.

API limits

REST API limits for Time Series Insights General Availability are specified in the REST API reference documentation.

Environment planning

The second area to focus on for planning your Azure Time Series Insights environment is ingress capacity. The daily ingress storage and event capacity is measured per minute, in 1-KB blocks. The maximum allowed packet size is 32 KB. Data packets larger than 32 KB are truncated.

You can increase the capacity of an S1 or S2 SKU to 10 units in a single environment. You can't migrate from an S1 environment to an S2 environment, or from an S2 environment to an S1 environment.

For ingress capacity, first determine the total ingress you require on a per-month basis. Next, determine what your per-minute needs are.

Throttling and latency play a role in per-minute capacity. If you have a spike in your data ingress that lasts less than 24 hours, Azure Time Series Insights can "catch up" at an ingress rate of two times the rates listed in the preceding table.

For example, suppose you have a single S1 SKU and ingress data at a rate of 720 events per minute. If the data rate spikes to 1,440 events per minute or less for less than one hour, there's no noticeable latency in your environment. However, if you exceed 1,440 events per minute for more than one hour, you'll likely experience latency in the data that is visualized and available for query in your environment.
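The spike example above can be expressed as a small check. The following Python sketch encodes only the two cases the example states and reports everything else as not covered. The function name and return strings are hypothetical, and scaling the per-minute rates linearly with units is an assumption extrapolated from the note on linear capacity scaling.

```python
# Per-minute event limits for a single unit, from the table in this article.
EVENTS_PER_MINUTE_PER_UNIT = {"S1": 720, "S2": 7200}


def classify_spike(sku: str, units: int,
                   spike_events_per_minute: float,
                   spike_minutes: float) -> str:
    """Classify an ingress spike using the two cases stated in the example."""
    # Assumption: per-minute capacity scales linearly with units.
    capacity = EVENTS_PER_MINUTE_PER_UNIT[sku] * units
    catch_up_rate = 2 * capacity  # "two times the rates listed" in the table

    if spike_events_per_minute <= catch_up_rate and spike_minutes < 60:
        return "no noticeable latency"
    if spike_events_per_minute > catch_up_rate and spike_minutes > 60:
        return "latency likely"
    # The article doesn't state an outcome for the remaining combinations.
    return "not covered by the example"


if __name__ == "__main__":
    print(classify_spike("S1", 1, 1440, 45))
    print(classify_spike("S1", 1, 1500, 90))
```

Treat the third outcome as a prompt to test with your own workload rather than as a prediction.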

You might not know in advance how much data you expect to push. In this case, you can find data telemetry for Azure IoT Hub and Azure Event Hubs in your Azure portal subscription. The telemetry can help you determine how to provision your environment. Use the Metrics pane in the Azure portal for the respective event source to view its telemetry. If you understand your event source metrics, you can more effectively plan and provision your Azure Time Series Insights environment.

Calculate ingress requirements

To calculate your ingress requirements:

  • Verify that your ingress capacity is above your average per-minute rate, and that your environment is large enough to handle anticipated spikes of up to two times your capacity that last less than one hour.

  • If ingress spikes occur that last for longer than one hour, use the spike rate as your average. Provision an environment with the capacity to handle the spike rate.
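The two rules above can be combined into a rough sizing helper. The following Python sketch is one possible reading, not an official calculator: the function name is hypothetical, per-minute capacity is assumed to scale linearly with units, and a spike of exactly one hour is conservatively treated as sustained.

```python
import math

# Per-minute event limits for a single unit, from the table in this article.
EVENTS_PER_MINUTE_PER_UNIT = {"S1": 720, "S2": 7200}
MAX_UNITS = 10  # an environment can be scaled to at most 10 units


def units_needed(sku: str, average_events_per_minute: float,
                 spike_events_per_minute: float = 0,
                 spike_minutes: float = 0) -> int:
    """Estimate the number of units required for a given ingress profile."""
    per_unit = EVENTS_PER_MINUTE_PER_UNIT[sku]
    sustained = average_events_per_minute
    burst_units = 0

    if spike_events_per_minute > 0:
        if spike_minutes >= 60:
            # Long spike: use the spike rate as the average.
            sustained = max(sustained, spike_events_per_minute)
        else:
            # Short spike: it must fit within two times the capacity.
            burst_units = math.ceil(spike_events_per_minute / (2 * per_unit))

    units = max(1, math.ceil(sustained / per_unit), burst_units)
    if units > MAX_UNITS:
        raise ValueError(f"{sku} can't cover this load with {MAX_UNITS} units")
    return units


if __name__ == "__main__":
    print(units_needed("S1", 700, spike_events_per_minute=1500, spike_minutes=30))
```

If the helper raises an error for S1, rerun it with S2 before you create the environment, because you can't migrate between the two SKUs afterward.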

Mitigate throttling and latency

For information about how to prevent throttling and latency, read Mitigate latency and throttling.

Shape your events

It's important to ensure that the way you send events to Azure Time Series Insights supports the size of the environment you're provisioning. (Conversely, you can map the size of the environment to how many events Azure Time Series Insights reads and the size of each event.) It's also important to think about the attributes that you might want to slice and filter by when you query your data.

Tip

Review the JSON shaping documentation in Sending events.

Ensure that you have reference data

A reference dataset is a collection of items that augment the events from your event source. The Azure Time Series Insights ingress engine joins each event from your event source with the corresponding data row in your reference dataset. The augmented event is then available for query. The join is based on the Primary Key columns that are defined in your reference dataset.

Note

Reference data isn't joined retroactively. Only current and future ingress data is matched and joined to the reference dataset after it's configured and uploaded. If you plan to send a large amount of historical data to Azure Time Series Insights and don't first upload or create reference data in Azure Time Series Insights, you might have to redo your work (hint: not fun).

To learn more about how to create, upload, and manage your reference data in Azure Time Series Insights, read our Reference dataset documentation.

Business disaster recovery

This section describes features of Azure Time Series Insights that keep apps and services running, even if a disaster occurs (known as business disaster recovery).

High availability

As an Azure service, Time Series Insights provides certain high availability features by using redundancies at the Azure region level. For example, Azure supports disaster recovery capabilities through its cross-region availability feature.

Additional high-availability features provided through Azure (and also available to any Time Series Insights instance) include:

Make sure you enable the relevant Azure features to provide global, cross-region high availability for your devices and users.

Note

If Azure is configured to enable cross-region availability, no additional cross-region availability configuration is required in Azure Time Series Insights.

IoT and event hubs

Some Azure IoT services also include built-in business disaster recovery features:

Integrating Time Series Insights with the other services provides additional disaster recovery opportunities. For example, telemetry sent to your event hub might be persisted to a backup Azure Blob storage database.

Time Series Insights

There are several ways to keep your Time Series Insights data, apps, and services running, even if they're disrupted.

However, you might determine that a complete backup copy of your Azure Time Series Insights environment is also required, for the following purposes:

  • As a failover instance to which Time Series Insights data and traffic can be redirected
  • To preserve data and auditing information

In general, the best way to duplicate a Time Series Insights environment is to create a second Time Series Insights environment in a backup Azure region. Events are also sent to this secondary environment from your primary event source. Make sure that you use a second, dedicated consumer group. Follow that source's business disaster recovery guidelines, as described earlier.

To create a duplicate environment:

  1. Create an environment in a second region. For more information, see Create a new Time Series Insights environment in the Azure portal.
  2. Create a second dedicated consumer group for your event source.
  3. Connect that event source to the new environment. Make sure that you designate the second, dedicated consumer group.
  4. Review the Time Series Insights IoT Hub and Event Hubs documentation.

If an event occurs:

  1. If your primary region is affected during a disaster incident, reroute operations to the backup Time Series Insights environment.
  2. Use your second region to back up and recover all Time Series Insights telemetry and query data.

Important

If a failover occurs:

  • A delay might occur.
  • A momentary spike in message processing might occur as operations are rerouted.

For more information, see Mitigate latency in Time Series Insights.

Next steps