Use Stream Analytics to process exported data from Application Insights

Azure Stream Analytics is well suited to processing data exported from Application Insights. It can pull data from a variety of sources, transform and filter it, and then route it to a variety of sinks.

In this example, we'll create an adapter that takes data from Application Insights, renames and processes some of the fields, and pipes it into Power BI.

Warning

There are better and easier recommended ways to display Application Insights data in Power BI. The path shown here is only an example of how to process exported data.

Block diagram: export through Stream Analytics (SA) to Power BI (PBI)

Create storage in Azure

Continuous export always outputs data to an Azure Storage account, so you need to create the storage first.

  1. Create a "classic" storage account in your subscription in the Azure portal.

    In the Azure portal, select Add, Data, Storage.

  2. Create a container.

    In the new storage, select Containers, click the Containers tile, and then click Add.

  3. Copy the storage access key.

    You'll need it soon to set up the input to the Stream Analytics service.

    In the storage, open Settings, Keys, and copy the primary access key.

Start continuous export to Azure storage

Continuous export moves data from Application Insights into Azure storage.

  1. In the Azure portal, browse to the Application Insights resource you created for your application.

    Select Browse, Application Insights, and then your application.

  2. Create a continuous export.

    Select Settings, Continuous Export, Add.

    Select the storage account you created earlier:

    Set the export destination.

    Set the event types you want to see:

    Choose the event types.

  3. Let some data accumulate. Sit back and let people use your application for a while. Telemetry will come in, and you'll see statistical charts in metric explorer and individual events in diagnostic search.

    The data will also be exported to your storage.

  4. Inspect the exported data. In Visual Studio, choose View / Cloud Explorer, and open Azure / Storage. (If you don't have this menu option, you need to install the Azure SDK: Open the New Project dialog and open Visual C# / Cloud / Get Azure SDK for .NET.)

    Screenshot showing how to set the event types you want to see.

    Make a note of the common part of the path name, which is derived from the application name and instrumentation key.

The events are written to blob files in JSON format. Each file may contain one or more events. So we'd like to read the event data and pick out the fields we want. There are all kinds of things we could do with the data, but our plan today is to use Stream Analytics to pipe the data to Power BI.
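As a rough illustration of reading the exported events, the following Python sketch parses the text of one blob and lists the event names it finds. It assumes that each line of the blob holds one JSON document and that event names sit in a nested event array, as the event-count query later in this article implies. The function name event_names and the sample records are made up for illustration; check your own exported data for the actual layout.

```python
import json


def event_names(blob_text):
    """Return the event names found in the text of one exported blob.

    Assumption: one JSON document per line, with names held in a
    nested "event" array. Adjust to match your exported data.
    """
    names = []
    for line in blob_text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        for item in record.get("event", []):
            if "name" in item:
                names.append(item["name"])
    return names


# Hypothetical sample: two exported records in one blob.
sample = "\n".join([
    json.dumps({"event": [{"name": "Checkout", "count": 1}]}),
    json.dumps({"event": [{"name": "Login", "count": 1}]}),
])
print(event_names(sample))
```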

Create an Azure Stream Analytics instance

From the Azure portal, select the Azure Stream Analytics service, and create a new Stream Analytics job:

Screenshot showing the main page for creating a Stream Analytics job in the Azure portal.

Screenshot showing the details needed when creating a new Stream Analytics job.

When the new job is created, select Go to resource.

Screenshot showing the message received when the new Stream Analytics job deployment succeeds.

Add a new input

Screenshot showing how to add an input to the Stream Analytics job.

Set it to take input from your Continuous Export blob:

Screenshot showing how to configure the Stream Analytics job to take input from a Continuous Export blob.

Now you'll need the Primary Access Key from your Storage Account, which you copied earlier. Set this as the Storage Account Key.

Set path prefix pattern

Be sure to set the Date Format to YYYY-MM-DD (with dashes).

The Path Prefix Pattern specifies where Stream Analytics finds the input files in the storage. You need to set it to correspond to how Continuous Export stores the data. Set it like this:

webapplication27_12345678123412341234123456789abcdef0/PageViews/{date}/{time}

In this example:

  • webapplication27 is the name of the Application Insights resource, all in lower case.
  • 1234... is the instrumentation key of the Application Insights resource, omitting dashes.
  • PageViews is the type of data you want to analyze. The available types depend on the filter you set in Continuous Export. Examine the exported data to see the other available types, and see the export data model.
  • /{date}/{time} is a pattern that you type literally, exactly as shown.

Note

Inspect the storage to make sure you get the path right.
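The naming rules above can be sketched as a small Python helper that assembles the pattern from its parts. The function name path_prefix, the resource name, and the instrumentation key below are hypothetical placeholders; the sketch only applies the rules listed in this section (lower-case resource name, dashes removed from the key, literal {date} and {time} tokens).

```python
def path_prefix(resource_name, instrumentation_key, data_type):
    """Build a Path Prefix Pattern for a Continuous Export input."""
    name = resource_name.lower()                # resource name, all lower case
    key = instrumentation_key.replace("-", "")  # instrumentation key without dashes
    # {date} and {time} stay as literal tokens in the pattern.
    return f"{name}_{key}/{data_type}/{{date}}/{{time}}"


# Hypothetical resource name and instrumentation key.
print(path_prefix("WebApplication27",
                  "00000000-1111-2222-3333-444444444444",
                  "PageViews"))
```

Compare the result with the folder names you see in the storage account before pasting it into the input settings.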

Add new output

Now select your job > Outputs > Add.

Screenshot showing how to select the Stream Analytics job to add a new output.

Select the new channel, and then click Outputs, Add, Power BI.

Provide your work or school account to authorize Stream Analytics to access your Power BI resource. Then choose a name for the output, and for the target Power BI dataset and table.

Set the query

The query governs the translation from input to output.

Use the Test function to check that you get the right output. Give it the sample data that you took from the inputs page.

Query to display counts of events

Paste this query:

SELECT
  flat.ArrayValue.name,
  count(*)
INTO
  [pbi-output]
FROM
  [export-input] A
OUTER APPLY GetElements(A.[event]) as flat
GROUP BY TumblingWindow(minute, 1), flat.ArrayValue.name
  • export-input is the alias we gave to the stream input.
  • pbi-output is the output alias we defined.
  • We use OUTER APPLY GetElements because the event name is in a nested JSON array. The SELECT then picks the event name, together with a count of the number of instances with that name in the time period. The GROUP BY clause groups the elements into time periods of one minute.
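To make the grouping concrete, here is a Python approximation of what the query computes: it flattens the nested event array and counts names per one-minute bucket. It is a sketch under assumptions, not a description of how Stream Analytics works internally. In particular, it buckets by an ISO 8601 timestamp assumed to be at context.data.eventTime, whereas the job assigns events to windows using its own timestamps and window boundaries. The function name count_per_minute and the sample records are hypothetical.

```python
from collections import Counter
from datetime import datetime


def count_per_minute(records):
    """Count event names per one-minute bucket.

    Assumption: each record carries an ISO 8601 timestamp at
    context.data.eventTime and names in a nested "event" array.
    """
    counts = Counter()
    for record in records:
        stamp = datetime.fromisoformat(record["context"]["data"]["eventTime"])
        window_start = stamp.replace(second=0, microsecond=0)
        # Like OUTER APPLY, keep a record even when its array is empty.
        items = record.get("event") or [{}]
        for item in items:
            counts[(window_start, item.get("name"))] += 1
    return counts


# Hypothetical sample records.
records = [
    {"context": {"data": {"eventTime": "2015-06-01T10:00:15"}}, "event": [{"name": "Login"}]},
    {"context": {"data": {"eventTime": "2015-06-01T10:00:45"}}, "event": [{"name": "Login"}]},
    {"context": {"data": {"eventTime": "2015-06-01T10:01:05"}}, "event": [{"name": "Checkout"}]},
]
for (window_start, name), total in count_per_minute(records).items():
    print(window_start, name, total)
```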

Query to display metric values

SELECT
  A.context.data.eventtime,
  avg(CASE WHEN flat.arrayvalue.myMetric.value IS NULL THEN 0 ELSE  flat.arrayvalue.myMetric.value END) as myValue
INTO
  [pbi-output]
FROM
  [export-input] A
OUTER APPLY GetElements(A.context.custom.metrics) as flat
GROUP BY TumblingWindow(minute, 1), A.context.data.eventtime
  • This query drills into the metrics telemetry to get the event time and the metric value. The metric values are inside an array, so we use the OUTER APPLY GetElements pattern to extract the rows. "myMetric" is the name of the metric in this case.
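The CASE expression in the query treats a missing metric value as 0 before averaging. The Python sketch below mirrors that arithmetic on a handful of records, ignoring the time-window grouping for brevity. The record layout (an array at context.custom.metrics whose elements map a metric name to an object with a value field) is inferred from the field paths in the query; the function name average_metric and the sample data are hypothetical.

```python
def average_metric(records, metric_name="myMetric"):
    """Average a custom metric, counting missing values as 0."""
    values = []
    for record in records:
        metrics = record.get("context", {}).get("custom", {}).get("metrics")
        # Like OUTER APPLY, a record with no metrics still contributes one row.
        for element in metrics or [{}]:
            value = (element.get(metric_name) or {}).get("value")
            values.append(0 if value is None else value)
    return sum(values) / len(values) if values else None


# Hypothetical sample records.
metric_records = [
    {"context": {"custom": {"metrics": [
        {"myMetric": {"value": 10}},
        {"otherMetric": {"value": 3}},   # no myMetric here, so it counts as 0
    ]}}},
    {"context": {"custom": {"metrics": [{"myMetric": {"value": 20}}]}}},
]
print(average_metric(metric_records))
```

Note the design consequence: array elements that carry some other metric are averaged in as 0, which pulls the result down. If that is not what you want, filter those rows out instead of substituting 0.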

Query to include values of dimension properties

WITH flat AS (
SELECT
  MySource.context.data.eventTime as eventTime,
  InstanceId = MyDimension.ArrayValue.InstanceId.value,
  BusinessUnitId = MyDimension.ArrayValue.BusinessUnitId.value
FROM MySource
OUTER APPLY GetArrayElements(MySource.context.custom.dimensions) MyDimension
)
SELECT
  eventTime,
  InstanceId,
  BusinessUnitId
INTO AIOutput
FROM flat
  • This query includes values of the dimension properties without depending on a particular dimension being at a fixed index in the dimension array.
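The idea of finding a dimension by name rather than by position can be sketched in Python as follows. This illustrates the lookup only; it is not a translation of the query, which emits one row per array element. The element shape (a property name mapped to an object with a value field) is assumed from the field paths used in the query, and the function name dimension_value and the sample values are hypothetical.

```python
def dimension_value(dimensions, name):
    """Return the value of the named dimension, wherever it sits in the array."""
    for element in dimensions:
        if name in element:
            return (element[name] or {}).get("value")
    return None  # the dimension is not present


# Hypothetical dimension array; the order of elements does not matter.
dimensions = [
    {"BusinessUnitId": {"value": "BU-7"}},
    {"InstanceId": {"value": "instance-42"}},
]
print(dimension_value(dimensions, "InstanceId"))
print(dimension_value(dimensions, "BusinessUnitId"))
```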

Run the job

You can select a date in the past to start the job from.

Select the job, and click Query.

Wait until the job is Running.

See results in Power BI

Warning

There are better and easier recommended ways to display Application Insights data in Power BI. The path shown here is only an example of how to process exported data.

Open Power BI with your work or school account, and select the dataset and table that you defined as the output of the Stream Analytics job.

Select your dataset and fields in Power BI.

Now you can use this dataset in reports and dashboards in Power BI.

Screenshot showing an example report generated from the dataset in Power BI.

No data?

Next steps