Scale your Stream Analytics job with Azure Machine Learning Studio (classic) functions

This article discusses how to efficiently scale Azure Stream Analytics jobs that use Azure Machine Learning functions. For information on how to scale Stream Analytics jobs in general, see the article Scaling jobs.

What is an Azure Machine Learning function in Stream Analytics?

A Machine Learning function in Stream Analytics can be used like a regular function call in the Stream Analytics query language. Behind the scenes, however, these function calls are actually Azure Machine Learning web service requests.

You can improve the throughput of Machine Learning web service requests by "batching" multiple rows together in the same web service API call. This grouping is called a mini-batch. For more information, see Azure Machine Learning Studio (classic) Web Services. Support for Azure Machine Learning Studio (classic) in Stream Analytics is in preview.

Configure a Stream Analytics job with Machine Learning functions

There are two parameters that configure the Machine Learning function used by your Stream Analytics job:

  • Batch size of the Machine Learning function calls.
  • The number of Streaming Units (SUs) provisioned for the Stream Analytics job.

To determine the appropriate number of SUs, decide whether you want to optimize the latency of the Stream Analytics job or the throughput of each SU. SUs can always be added to a job to increase the throughput of a well-partitioned Stream Analytics query, but additional SUs increase the cost of running the job.

Determine the latency tolerance for your Stream Analytics job. Increasing the batch size increases both the latency of your Azure Machine Learning requests and the latency of the Stream Analytics job.

Increasing the batch size allows the Stream Analytics job to process more events with the same number of Machine Learning web service requests. The increase in Machine Learning web service latency is usually sublinear in the increase in batch size.

It's important to consider the most cost-efficient batch size for a Machine Learning web service in any given situation. The default batch size for web service requests is 1,000. You can change this default size by using the Stream Analytics REST API or the PowerShell client for Stream Analytics.
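
As a rough illustration only, the following Python sketch sends such an update through the Azure management REST API. The API version and the batchSize property shown under the Microsoft.MachineLearning/WebService binding are assumptions to verify against the Stream Analytics REST API reference, and the placeholder names are hypothetical:

    # Hypothetical sketch: change the ML function's batch size through the Azure
    # management REST API. The api-version and the batchSize property under the
    # Microsoft.MachineLearning/WebService binding are assumptions; verify them
    # against the Stream Analytics REST API reference before relying on this.
    import requests

    SUBSCRIPTION = "<subscription-id>"
    RESOURCE_GROUP = "<resource-group>"
    JOB_NAME = "<stream-analytics-job>"
    FUNCTION_NAME = "sentiment"              # the ML function called in the query
    TOKEN = "<azure-ad-bearer-token>"

    url = (
        "https://management.azure.com"
        f"/subscriptions/{SUBSCRIPTION}/resourceGroups/{RESOURCE_GROUP}"
        f"/providers/Microsoft.StreamAnalytics/streamingjobs/{JOB_NAME}"
        f"/functions/{FUNCTION_NAME}?api-version=2016-03-01"
    )

    # Only the property being changed is shown; the function's other binding
    # properties (endpoint, apiKey, inputs, outputs) stay as already configured.
    body = {
        "properties": {
            "type": "Scalar",
            "properties": {
                "binding": {
                    "type": "Microsoft.MachineLearning/WebService",
                    "properties": {"batchSize": 5000},
                }
            },
        }
    }

    response = requests.patch(url, json=body, headers={"Authorization": f"Bearer {TOKEN}"})
    response.raise_for_status()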

Once you've decided on a batch size, you can set the number of Streaming Units (SUs) based on the number of events that the function needs to process per second. For more information about Streaming Units, see Stream Analytics scale jobs.

Every 6 SUs get 20 concurrent connections to the Machine Learning web service; however, jobs with 1 SU or 3 SUs also get 20 concurrent connections.

If your application generates 200,000 events per second and the batch size is 1,000, the resulting web service latency is 200 ms. This rate means that each connection can make five requests to the Machine Learning web service per second. With 20 connections, the Stream Analytics job can process 20,000 events in 200 ms, and therefore 100,000 events in a second.

To process 200,000 events per second, the Stream Analytics job therefore needs 40 concurrent connections, which comes out to 12 SUs. The following diagram illustrates the requests from the Stream Analytics job to the Machine Learning web service endpoint: every 6 SUs have at most 20 concurrent connections to the Machine Learning web service.

Diagram: scaling Stream Analytics with Machine Learning functions, two-job example.

In general, with B the batch size, L the web service latency at batch size B in milliseconds, and N the number of SUs, the throughput of a Stream Analytics job is approximately:

    throughput ≈ 20 × (N / 6) × (1000 / L) × B  events per second

Here 20 × (N / 6) is the number of concurrent connections (jobs with 1 SU or 3 SUs count as one full set of 20 connections), and 1000 / L is the number of batch requests each connection can complete per second.
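
As a quick check of this formula, the following Python sketch (a back-of-the-envelope helper, not part of any Stream Analytics tooling) reproduces the 200,000 events-per-second example above:

    import math

    def throughput(sus: int, batch_size: int, latency_ms: float) -> float:
        """Approximate events/sec: 20 connections per 6 SUs, each completing
        1000/L batches of batch_size events per second. Jobs with 1 or 3 SUs
        still count as one full set of 20 connections."""
        connection_sets = max(1, sus // 6)
        return 20 * connection_sets * (1000 / latency_ms) * batch_size

    def sus_needed(target_rate: float, batch_size: int, latency_ms: float) -> int:
        """Smallest multiple of 6 SUs whose estimated throughput meets the target."""
        per_set = 20 * (1000 / latency_ms) * batch_size
        return 6 * math.ceil(target_rate / per_set)

    print(throughput(12, 1000, 200))        # 200000.0 events per second with 12 SUs
    print(sus_needed(200_000, 1000, 200))   # 12 SUs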

You can also configure the 'max concurrent calls' setting on the Machine Learning web service. It's recommended to set this parameter to the maximum value (currently 200).

For more information on this setting, review the Scaling article for Machine Learning web services.

Example - Sentiment Analysis

The following example includes a Stream Analytics job with the sentiment analysis Machine Learning function.

The query is a simple, fully partitioned query followed by the sentiment function, as shown in the following example:

    -- Call the sentiment Machine Learning function on each input event,
    -- then project the score it returns.
    WITH subquery AS (
        SELECT text, sentiment(text) AS result FROM input
    )

    SELECT text, result.[Score]
    INTO output
    FROM subquery

Let's examine the configuration necessary to create a Stream Analytics job that does sentiment analysis of tweets at a rate of 10,000 tweets per second.

Could this Stream Analytics job handle the traffic with 1 SU? The job can keep up with the input using the default batch size of 1,000. The default latency of the sentiment analysis Machine Learning web service (with a default batch size of 1,000) creates no more than a second of latency.

The Stream Analytics job's overall, or end-to-end, latency would typically be a few seconds. Take a more detailed look at this Stream Analytics job, especially the Machine Learning function calls. With a batch size of 1,000, a throughput of 10,000 events per second takes about 10 requests per second to the web service. Even with one SU, there are enough concurrent connections to accommodate this input traffic.

If the input event rate increases by 100x, the Stream Analytics job needs to process 1,000,000 tweets per second. There are two options to accomplish the increased scale:

  1. Increase the batch size.
  2. Partition the input stream to process the events in parallel.

With the first option, the job latency increases.

With the second option, you have to provision more SUs to make more concurrent Machine Learning web service requests. This greater number of SUs increases the job cost.

Let's look at the scaling using the following latency measurements for each batch size:

| Latency | Batch size |
| --- | --- |
| 200 ms | 1,000-event batches or below |
| 250 ms | 5,000-event batches |
| 300 ms | 10,000-event batches |
| 500 ms | 25,000-event batches |
  1. Using the first option (not provisioning more SUs), the batch size could be increased to 25,000. Increasing the batch size in this way would allow the job to process 1,000,000 events with 20 concurrent connections to the Machine Learning web service (at a latency of 500 ms per call). So the additional latency of the Stream Analytics job due to the sentiment function requests against the Machine Learning web service would increase from 200 ms to 500 ms. However, the batch size can't be increased infinitely, because the Machine Learning web services require the payload size of a request to be 4 MB or smaller, and web service requests time out after 100 seconds of operation.
  2. Using the second option, the batch size is left at 1,000 with a 200-ms web service latency. Every 20 concurrent connections to the web service can then process 1,000 * 20 * 5 events = 100,000 events per second, so to process 1,000,000 events per second the job needs 60 SUs. Compared to the first option, the Stream Analytics job makes more web service batch requests, in turn generating an increased cost. A short sketch comparing both options follows this list.
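
The same back-of-the-envelope arithmetic can be run for any of the measured batch sizes. The Python sketch below (an illustration only, using the latency table above) prints the estimated SU count and the added per-call latency for each choice:

    import math

    # Latency measurements from the table above: batch size -> ML web service latency (ms).
    LATENCY_MS = {1_000: 200, 5_000: 250, 10_000: 300, 25_000: 500}
    TARGET = 1_000_000  # events per second

    for batch_size, latency_ms in LATENCY_MS.items():
        # Events/sec handled by one set of 20 concurrent connections (one 6-SU block).
        per_set = 20 * (1000 / latency_ms) * batch_size
        sus = 6 * math.ceil(TARGET / per_set)
        print(f"batch={batch_size:>6}: {sus:>3} SUs, added latency ≈ {latency_ms} ms per call")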

The following table shows the throughput of the Stream Analytics job for different numbers of SUs and batch sizes (in events per second).

| Batch size (ML latency) | 500 (200 ms) | 1,000 (200 ms) | 5,000 (250 ms) | 10,000 (300 ms) | 25,000 (500 ms) |
| --- | --- | --- | --- | --- | --- |
| 1 SU | 2,500 | 5,000 | 20,000 | 30,000 | 50,000 |
| 3 SUs | 2,500 | 5,000 | 20,000 | 30,000 | 50,000 |
| 6 SUs | 2,500 | 5,000 | 20,000 | 30,000 | 50,000 |
| 12 SUs | 5,000 | 10,000 | 40,000 | 60,000 | 100,000 |
| 18 SUs | 7,500 | 15,000 | 60,000 | 90,000 | 150,000 |
| 24 SUs | 10,000 | 20,000 | 80,000 | 120,000 | 200,000 |
| 60 SUs | 25,000 | 50,000 | 200,000 | 300,000 | 500,000 |

By now, you should have a good understanding of how Machine Learning functions in Stream Analytics work. You likely also understand that Stream Analytics jobs "pull" data from data sources and that each "pull" returns a batch of events for the Stream Analytics job to process. How does this pull model impact the Machine Learning web service requests?

Normally, the batch size we set for Machine Learning functions won't divide evenly into the number of events returned by each Stream Analytics job "pull". When that happens, the Machine Learning web service is called with "partial" batches. Using partial batches avoids incurring additional job latency overhead from coalescing events from one pull to the next.
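
To illustrate the idea (a simplified sketch, not the actual Stream Analytics internals), splitting whatever a single pull returns into calls of at most the configured batch size naturally leaves a trailing partial batch instead of holding events back for the next pull:

    def split_into_batches(events, batch_size=1000):
        """Split the events returned by one 'pull' into web service calls of at
        most batch_size events; the last call may be a partial batch."""
        return [events[i:i + batch_size] for i in range(0, len(events), batch_size)]

    pull = list(range(2750))              # suppose one pull returned 2,750 events
    print([len(b) for b in split_into_batches(pull)])   # [1000, 1000, 750]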

In the Monitor area of a Stream Analytics job, three additional function-related metrics have been added. They are FUNCTION REQUESTS, FUNCTION EVENTS, and FAILED FUNCTION REQUESTS, as shown in the graphic below.

Screenshot: function-related metrics in the Stream Analytics job's Monitor area.

They are defined as follows:

FUNCTION REQUESTS: The number of function requests.

FUNCTION EVENTS: The number of events in the function requests.

FAILED FUNCTION REQUESTS: The number of failed function requests.

Key takeaways

To scale a Stream Analytics job with Machine Learning functions, consider the following factors:

  1. The input event rate.
  2. The tolerated latency for the running Stream Analytics job (and thus the batch size of the Machine Learning web service requests).
  3. The provisioned Stream Analytics SUs and the number of Machine Learning web service requests (the additional function-related costs).

A fully partitioned Stream Analytics query was used as an example. If a more complex query is needed, the Microsoft Q&A question page for Azure Stream Analytics is a great resource for getting additional help from the Stream Analytics team.

Next steps

To learn more about Stream Analytics, see: