Choose a real-time analytics and streaming processing technology on Azure

Several services are available for real-time analytics and streaming processing on Azure. This article provides the information you need to decide which technology is the best fit for your application.

When to use Azure Stream Analytics

Azure Stream Analytics is the recommended service for stream analytics on Azure. It's meant for a wide range of scenarios that include but aren't limited to:

  • Dashboards for data visualization
  • Real-time alerts from temporal and spatial patterns or anomalies
  • Extract, Transform, Load (ETL)
  • IoT Edge

Adding an Azure Stream Analytics job to your application is the fastest way to get streaming analytics up and running in Azure, using the SQL language you already know. Azure Stream Analytics is a job service, so you don't have to spend time managing clusters, and availability is covered by a 99.9% SLA at the job level. Billing is also done at the job level, which keeps startup costs low (a job can start with one Streaming Unit) while leaving room to scale (up to 192 Streaming Units). Running a few Stream Analytics jobs is much more cost effective than running and maintaining a cluster.
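
To put the 99.9% figure in perspective, the arithmetic below converts an availability percentage into a downtime budget. This is only a rough sketch: the function name `allowed_downtime_minutes` and the 30-day month are assumptions made for illustration, and the actual SLA terms define how downtime is measured.

```python
def allowed_downtime_minutes(sla_percent: float, days: int = 30) -> float:
    """Return the downtime budget, in minutes, implied by an availability
    percentage over a period of `days` days (30 is an assumed month length)."""
    total_minutes = days * 24 * 60
    return total_minutes * (100.0 - sla_percent) / 100.0


# A 99.9% target over a 30-day month leaves a budget of roughly 43 minutes.
budget = allowed_downtime_minutes(99.9)
print(f"Downtime budget: {budget:.1f} minutes")
```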

Azure Stream Analytics has a rich out-of-the-box experience. You can immediately take advantage of the following features without any additional setup:

  • Built-in temporal operators, such as windowed aggregates, temporal joins, and temporal analytic functions
  • Native Azure input and output adapters
  • Support for slowly changing reference data (also known as lookup tables), including joining with geospatial reference data for geofencing
  • Integrated solutions, such as Anomaly Detection
  • Multiple time windows in the same query
  • Ability to compose multiple temporal operators in arbitrary sequences
  • Under 100-ms end-to-end latency from input arriving at Event Hubs to output landing in Event Hubs, including the network delay from and to Event Hubs, at sustained high throughput
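
For readers unfamiliar with windowed aggregates, the idea can be sketched in plain Python. This is a conceptual illustration only, not Stream Analytics code: the function `tumbling_window_counts`, the event tuples, and the sensor names are all hypothetical, and the boundary handling (window start inclusive) is a simplification that may differ from how the service assigns events to windows.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def tumbling_window_counts(
    events: Iterable[Tuple[float, str]], window_seconds: int
) -> Dict[Tuple[int, str], int]:
    """Count events per key in fixed, non-overlapping (tumbling) windows.

    Each event is a (timestamp_seconds, key) pair. A window is identified by
    its start time; an event belongs to exactly one window.
    """
    counts: Dict[Tuple[int, str], int] = defaultdict(int)
    for timestamp, key in events:
        # Round the timestamp down to the start of its window.
        window_start = int(timestamp // window_seconds) * window_seconds
        counts[(window_start, key)] += 1
    return dict(counts)


# Hypothetical sample events: (seconds since start, sensor name).
events = [
    (0.5, "sensor-a"),
    (3.0, "sensor-a"),
    (9.9, "sensor-b"),
    (10.0, "sensor-a"),
    (14.2, "sensor-a"),
]
result = tumbling_window_counts(events, window_seconds=10)
for (window_start, key), count in sorted(result.items()):
    print(window_start, key, count)
```

In a managed streaming service, this grouping is expressed declaratively in the query rather than hand-written, which is the convenience the built-in temporal operators provide.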

When to use other technologies

You want to write UDFs, UDAs, and custom deserializers in a language other than JavaScript or C#

Azure Stream Analytics supports user-defined functions (UDFs) and user-defined aggregates (UDAs) in JavaScript for cloud jobs and in C# for IoT Edge jobs. C# user-defined deserializers are also supported. If you want to implement a deserializer, a UDF, or a UDA in another language, such as Java or Python, you can use Spark Structured Streaming. You can also run the Event Hubs EventProcessorHost on your own virtual machines to do arbitrary streaming processing.
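
As a rough illustration of the kind of logic you might want to write in another language, the sketch below shows a custom deserializer and a scalar function in plain Python. Everything here is hypothetical: the pipe-delimited wire format, the `Reading` type, and the function names are assumptions made for the example, and the code is not tied to the Spark or Event Hubs APIs. Plugging it into a real engine would require that engine's own registration mechanism.

```python
from dataclasses import dataclass
from typing import Iterator


@dataclass(frozen=True)
class Reading:
    """One decoded record (hypothetical schema)."""
    device_id: str
    temperature_c: float


def deserialize_lines(payload: bytes) -> Iterator[Reading]:
    """Hypothetical custom deserializer.

    Decodes a UTF-8 payload of 'deviceId|temperatureC' lines into records,
    skipping blank lines.
    """
    for line in payload.decode("utf-8").splitlines():
        line = line.strip()
        if not line:
            continue
        device_id, temperature = line.split("|")
        yield Reading(device_id=device_id, temperature_c=float(temperature))


def to_fahrenheit(celsius: float) -> float:
    """Hypothetical scalar user-defined function applied to each record."""
    return celsius * 9.0 / 5.0 + 32.0


records = list(deserialize_lines(b"dev-1|20.0\n\ndev-2|100.0\n"))
readings = [(r.device_id, to_fahrenheit(r.temperature_c)) for r in records]
print(readings)
```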

Your solution is in a multi-cloud or on-premises environment

Azure Stream Analytics is Microsoft's proprietary technology and is only available on Azure. If you need your solution to be portable across clouds or on-premises environments, consider open-source technologies such as Spark Structured Streaming or Storm.

Next steps