What is Azure Stream Analytics?

Azure Stream Analytics is a real-time analytics and complex event-processing engine designed to analyze and process high volumes of fast streaming data from multiple sources simultaneously. It can identify patterns and relationships in information extracted from many input sources, including devices, sensors, clickstreams, social media feeds, and applications. These patterns can be used to trigger actions and initiate workflows such as creating alerts, feeding information to a reporting tool, or storing transformed data for later use. Stream Analytics is also available on the Azure IoT Edge runtime, so you can process data directly on IoT devices.

The following scenarios are examples of when you can use Azure Stream Analytics:

  • Analyze real-time telemetry streams from IoT devices
  • Web logs/clickstream analytics
  • Geospatial analytics for fleet management and driverless vehicles
  • Remote monitoring and predictive maintenance of high-value assets
  • Real-time analytics on point-of-sale data for inventory control and anomaly detection

How does Stream Analytics work?

An Azure Stream Analytics job consists of an input, a query, and an output. Stream Analytics ingests data from Azure Event Hubs (including Azure Event Hubs for Apache Kafka), Azure IoT Hub, or Azure Blob Storage. The query, which is based on the SQL query language, can filter, sort, aggregate, and join streaming data over a period of time. You can also extend this SQL language with JavaScript and C# user-defined functions (UDFs). When performing aggregation operations, you can adjust the event ordering options and the duration of time windows through simple language constructs and/or configurations.
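To make the idea of aggregating over a time window concrete, the following Python sketch counts events per device in fixed, non-overlapping (tumbling) windows. It is only an illustration of the concept, not Stream Analytics query syntax or its implementation; the function name, the `(timestamp, device_id)` event shape, and the sample data are assumptions made for this example.

```python
from collections import defaultdict


def tumbling_window_counts(events, window_seconds):
    """Count events per (window_start, device_id) in fixed, non-overlapping windows.

    `events` is an iterable of (timestamp_seconds, device_id) tuples.
    This event shape is hypothetical and chosen only for illustration.
    """
    counts = defaultdict(int)
    for timestamp, device_id in events:
        # Every timestamp belongs to exactly one window, identified by its start.
        window_start = (timestamp // window_seconds) * window_seconds
        counts[(window_start, device_id)] += 1
    return dict(counts)


# Hypothetical sample events: (seconds since start, device id).
events = [(0, "a"), (3, "a"), (9, "b"), (10, "a"), (12, "b"), (19, "b")]
result = tumbling_window_counts(events, window_seconds=10)
```

Changing `window_seconds` is the analogue of adjusting the window duration in a query: the same events are regrouped without any change to the counting logic.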

Each job has one or more outputs for the transformed data, and you can control what happens in response to the information you've analyzed. For example, you can:

  • Send data to services such as Azure Functions or Service Bus topics or queues to trigger communications or custom workflows downstream.
  • Send data to a Power BI dashboard for real-time dashboarding.
  • Store data in other Azure storage services (for example, Azure Data Lake or Azure Synapse Analytics) to train a machine learning model based on historical data or to perform batch analytics.

The following image shows how data is sent to Stream Analytics, analyzed, and sent on for other actions such as storage or presentation:

Image: Stream Analytics intro pipeline

Key capabilities and benefits

Azure Stream Analytics is designed to be easy to use, flexible, reliable, and scalable to any job size. It is available across multiple Azure regions. The following image illustrates the key capabilities of Azure Stream Analytics:

Image: Stream Analytics key capabilities

Ease of getting started

Azure Stream Analytics is easy to start with. It takes only a few clicks to connect to multiple sources and sinks and create an end-to-end pipeline. Stream Analytics can connect to Azure Event Hubs and Azure IoT Hub for streaming data ingestion, and to Azure Blob storage to ingest historical data. Job input can also include static or slow-changing reference data from Azure Blob storage or SQL Database, which you can join to streaming data to perform lookup operations.
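The lookup against reference data described above can be pictured as a join between each streaming event and a small, slowly changing table. The Python sketch below shows that idea only; it is not how Stream Analytics implements reference data joins, and the function name, field names, and sample records are all hypothetical.

```python
def join_with_reference(stream_events, reference_by_key, key):
    """Enrich each streaming event with its matching reference row.

    Events without a matching reference row are dropped, which mirrors
    inner-join behavior. All names here are illustrative assumptions.
    """
    joined = []
    for event in stream_events:
        reference_row = reference_by_key.get(event[key])
        if reference_row is None:
            continue  # no lookup match, so the event is not emitted
        joined.append({**event, **reference_row})
    return joined


# Hypothetical reference data (static or slow-changing) keyed by device id.
device_locations = {
    "sensor-1": {"site": "Plant A"},
    "sensor-2": {"site": "Plant B"},
}

# Hypothetical streaming readings.
readings = [
    {"device_id": "sensor-1", "temperature": 21.5},
    {"device_id": "sensor-9", "temperature": 30.0},
    {"device_id": "sensor-2", "temperature": 19.0},
]

enriched = join_with_reference(readings, device_locations, "device_id")
```

Keeping the reference table as an in-memory dictionary makes each lookup a constant-time operation, which is why this pattern suits data that changes rarely compared with the stream.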

Stream Analytics can route job output to many storage systems, such as Azure Blob storage and Azure SQL Database. You can run batch analytics on stored output with Azure HDInsight, or you can send the output to another service, such as Event Hubs, for consumption.

For the full list of Stream Analytics outputs, see Understand outputs from Azure Stream Analytics.

Programmer productivity

Azure Stream Analytics uses a simple SQL-based query language that has been augmented with powerful temporal constraints to analyze data in motion. To define job transformations, you use a simple, declarative Stream Analytics query language that lets you author complex temporal queries and analytics using simple SQL constructs. Because the Stream Analytics query language is consistent with SQL, familiarity with SQL is sufficient to start creating jobs. You can also create jobs by using developer tools such as Azure PowerShell or Azure Resource Manager templates.

The Stream Analytics query language offers a wide array of functions for analyzing and processing streaming data. It supports simple data manipulation, aggregation and analytics functions, geospatial functions, pattern matching, and anomaly detection. You can edit queries in the portal and test them using sample data extracted from a live stream.

You can extend the capabilities of the query language by defining and invoking additional functions. You can define function calls in Azure Machine Learning to take advantage of Azure Machine Learning solutions, and you can integrate JavaScript or C# user-defined functions (UDFs) or user-defined aggregates to perform complex calculations as part of a Stream Analytics query.

Fully managed

Azure Stream Analytics is a fully managed serverless (PaaS) offering on Azure. You don't have to provision any hardware, manage clusters to run your jobs, or update the OS or software. Azure Stream Analytics fully manages your job, so you can focus on your business logic and not on the infrastructure.

Run in the cloud or on the intelligent edge

Azure Stream Analytics can run in the cloud for large-scale analytics, or on IoT Edge for ultra-low latency analytics. It uses the same tools and query language in the cloud and on the edge, enabling developers to build truly hybrid architectures for stream processing.

Low total cost of ownership

As a cloud service, Stream Analytics is optimized for cost. There are no upfront costs; you pay only for the streaming units you consume. No commitment or cluster provisioning is required, and you can scale the job up or down based on your business needs.

Mission-critical ready

Azure Stream Analytics is available across multiple regions worldwide and is designed to run mission-critical workloads by supporting reliability, security, and compliance requirements.

Reliability

Azure Stream Analytics guarantees exactly-once event processing and at-least-once delivery of events, so events are never lost. Exactly-once processing is guaranteed with selected outputs, as described in Event Delivery Guarantees.
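One common way to see how at-least-once delivery can still produce an exactly-once result is an output that upserts by a unique event key, so a redelivered event overwrites itself instead of creating a duplicate. The Python sketch below illustrates that general pattern under stated assumptions; it does not describe how Stream Analytics or any specific output implements its guarantees, and `KeyedSink`, `deliver_with_retry`, and the event identifiers are hypothetical.

```python
class KeyedSink:
    """A hypothetical output that stores one row per unique event id."""

    def __init__(self):
        self.rows = {}

    def upsert(self, event_id, payload):
        # Writing the same event id twice leaves a single row.
        self.rows[event_id] = payload


def deliver_with_retry(sink, events, fail_after_index):
    """Simulate at-least-once delivery with one failed attempt.

    The first attempt writes events up to `fail_after_index` and then
    fails before acknowledging. The sender cannot tell what arrived,
    so it resends the whole batch.
    """
    for event_id, payload in events[: fail_after_index + 1]:
        sink.upsert(event_id, payload)
    # Retry: every event is sent again, including those already written.
    for event_id, payload in events:
        sink.upsert(event_id, payload)


sink = KeyedSink()
events = [("e1", 10), ("e2", 20), ("e3", 30)]
deliver_with_retry(sink, events, fail_after_index=1)
```

Events `e1` and `e2` are delivered twice in this run, yet the sink ends with exactly three rows, because the upsert makes redelivery harmless.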

Azure Stream Analytics has built-in recovery capabilities in case the delivery of an event fails. It also provides built-in checkpoints that maintain the state of your job and produce repeatable results.
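The role of checkpoints in producing repeatable results can be sketched with a toy job that keeps a running total. The state (input offset plus total) is saved periodically; after a simulated crash, the job resumes from the last saved state and reaches the same result as an uninterrupted run. This is a conceptual Python illustration only, not the service's checkpointing mechanism, and every name and parameter in it is an assumption.

```python
def run_job(events, state=None, crash_before_index=None, checkpoint_every=2):
    """Sum `events`, checkpointing state every `checkpoint_every` events.

    Returns (total, last_checkpoint). If a crash is simulated, total is
    None and only the last checkpoint survives.
    """
    state = dict(state) if state else {"offset": 0, "total": 0}
    checkpoint = dict(state)
    for index in range(state["offset"], len(events)):
        if crash_before_index is not None and index >= crash_before_index:
            return None, checkpoint  # in-memory state is lost on crash
        state["total"] += events[index]
        state["offset"] = index + 1
        if state["offset"] % checkpoint_every == 0:
            checkpoint = dict(state)  # persist a copy of the state
    return state["total"], checkpoint


events = [5, 1, 4, 2, 3]

# Uninterrupted run.
expected_total, _ = run_job(events)

# Run that crashes before processing index 3, then resumes from the checkpoint.
_, saved = run_job(events, crash_before_index=3)
resumed_total, _ = run_job(events, state=saved)
```

The event at index 2 is processed twice across the two runs, but because the resumed run starts from the checkpointed total rather than the lost in-memory one, it is counted only once in the final result.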

As a managed service, Stream Analytics guarantees event processing with 99.9% availability at a minute level of granularity. For more information, see the Stream Analytics SLA page.

Security

In terms of security, Azure Stream Analytics encrypts all incoming and outgoing communications and supports TLS 1.2. Built-in checkpoints are also encrypted. Stream Analytics doesn't store the incoming data, because all processing is done in memory.

Compliance

Azure Stream Analytics adheres to multiple compliance certifications, as described in the overview of Azure compliance.

Performance

Stream Analytics can process millions of events every second and deliver results with ultra-low latency. It lets you scale up and scale out to handle large real-time and complex event-processing applications. Stream Analytics achieves higher performance through partitioning, which allows complex queries to be parallelized and executed on multiple streaming nodes. Azure Stream Analytics is built on Trill, a high-performance in-memory streaming analytics engine developed in collaboration with Microsoft Research.

Next steps

You now have an overview of Azure Stream Analytics. Next, you can dive deeper and create your first Stream Analytics job: