使用 Azure Cosmos DB 和 Power BI 创建实时仪表板Create a real-time dashboard using Azure Cosmos DB and Power BI

本文介绍使用 Azure Cosmos DB 和 Azure Analysis Services 在 Power BI 中创建实时天气仪表板所需执行的步骤。This article describes the steps required to create a live weather dashboard in Power BI using Azure Cosmos DB and Azure Analysis Services. Power BI 仪表板将会显示图表,以便显示有关区域中的气温和降水量的实时信息。The Power BI dashboard will display charts to show real-time information about temperature and rainfall in a region.

报告方案Reporting scenarios

可通过多种方式基于 Azure Cosmos DB 中存储的数据设置报告仪表板。There are multiple ways to set up reporting dashboards on data stored in Azure Cosmos DB. 下表根据数据的过时要求和大小描述了每种方案的报告设置:Depending on the staleness requirements and the size of the data, the following table describes the reporting setup for each scenario:

方案Scenario 设置Setup
1.生成即席报表(不刷新)1. Generating ad-hoc reports (no refresh) 采用导入模式的 Power BI Azure Cosmos DB 连接器Power BI Azure Cosmos DB connector with import mode
2.生成定期刷新的即席报表2. Generating ad-hoc reports with periodic refresh 采用导入模式的 Power BI Azure Cosmos DB 连接器(按计划定期刷新)Power BI Azure Cosmos DB connector with import mode (Scheduled periodic refresh)
3.针对大型数据集 (< 10 GB) 进行报告3. Reporting on large data sets (< 10 GB) 支持增量刷新的 Power BI Azure Cosmos DB 连接器Power BI Azure Cosmos DB connector with incremental refresh
4.针对大型数据集实时报告4. Reporting real time on large data sets 支持直接查询的 Power BI Azure Analysis Services 连接器 + Azure Analysis Services(Azure Cosmos DB 连接器)Power BI Azure Analysis Services connector with direct query + Azure Analysis Services (Azure Cosmos DB connector)
5.针对包含聚合值的实时数据进行报告5. Reporting on live data with aggregates 支持直接查询的 Power BI Spark 连接器 + Azure Databricks + Cosmos DB Spark 连接器。Power BI Spark connector with direct query + Azure Databricks + Cosmos DB Spark connector.
6.针对大型数据集中包含聚合值的实时数据进行报告6. Reporting on live data with aggregates on large data sets 支持直接查询的 Power BI Azure Analysis Services 连接器 + Azure Analysis Services + Azure Databricks + Cosmos DB Spark 连接器。Power BI Azure Analysis Services connector with direct query + Azure Analysis Services + Azure Databricks + Cosmos DB Spark connector.

可以使用 Azure Cosmos DB Power BI 连接器轻松设置方案 1 和 2。Scenarios 1 and 2 can be easily set up using the Azure Cosmos DB Power BI connector. 本文介绍方案 3 和 4 的设置。This article describes the setups for scenarios 3 and 4.

支持增量刷新的 Power BIPower BI with incremental refresh

Power BI 提供一种可以配置增量刷新的模式。Power BI has a mode where incremental refresh can be configured. 采用此模式便无需创建和管理 Azure Analysis Services 分区了。This mode eliminates the need to create and manage Azure Analysis Services partitions. 可将增量刷新设置为仅筛选大型数据集中的最新更新。Incremental refresh can be set up to filter only the latest updates in large datasets. 但是,此模式仅适用于数据集大小上限为 10 GB 的 Power BI Premium 服务。However, this mode works only with Power BI Premium service that has a dataset limitation of 10 GB.

Power BI Azure Analysis 连接器 + Azure Analysis ServicesPower BI Azure Analysis connector + Azure Analysis Services

Azure Analysis Services 提供一个完全托管的平台即服务,用于在云中承载企业级数据模型。Azure Analysis Services provides a fully managed platform as a service that hosts enterprise-grade data models in the cloud. 可将巨量数据集从 Azure Cosmos DB 载入 Azure Analysis Services。Massive data sets can be loaded from Azure Cosmos DB into Azure Analysis Services. 为了避免总要查询整个数据集,可将数据集细分到 Azure Analysis Services 分区中,而这些分区可按照不同的频率独立刷新。To avoid querying the entire dataset all the time, the datasets can be subdivided into Azure Analysis Services partitions, which can be refreshed independently at different frequencies.

Power BI 增量刷新Power BI incremental refresh

将天气数据引入 Azure Cosmos DBIngest weather data into Azure Cosmos DB

设置一个引入管道,以将天气数据加载到 Azure Cosmos DB 中。Set up an ingestion pipeline to load weather data to Azure Cosmos DB. 可以设置一个 Azure 数据工厂 (ADF) 作业,以使用 HTTP 源和 Cosmos DB 接收器定期将最新的天气数据载入 Azure Cosmos DB。You can set up an Azure Data Factory (ADF) job to periodically load the latest weather data into Azure Cosmos DB using the HTTP Source and Cosmos DB sink.

将 Power BI 连接到 Azure Cosmos DBConnect Power BI to Azure Cosmos DB

  1. 将 Azure Cosmos 帐户连接到 Power BI - 打开 Power BI Desktop,使用 Azure Cosmos DB 连接器选择适当的数据库和容器。Connect Azure Cosmos account to Power BI - Open the Power BI Desktop and use the Azure Cosmos DB connector to select the right database and container.

    Azure Cosmos DB Power BI 连接器

  2. 配置增量刷新 - 遵循使用 Power BI 进行增量刷新一文中的步骤,为数据集配置增量刷新。Configure incremental refresh - Follow the steps in incremental refresh with Power BI article to configure incremental refresh for the dataset. 按以下屏幕截图中所示添加 RangeStartRangeEnd 参数:Add the RangeStart and RangeEnd parameters as shown in the following screenshot:

    配置范围参数

    由于数据集包含文本格式的 Date 列,因此应转换 RangeStartRangeEnd 参数,以使用以下筛选器。Since the dataset has a Date column that is in text form, the RangeStart and RangeEnd parameters should be transformed to use the following filter. 在“高级编辑器”窗格中修改查询,并添加以下文本以基于 RangeStart 和 RangeEnd 参数筛选行: In the Advanced Editor pane, modify your query add the following text to filter the rows based on the RangeStart and RangeEnd parameters:

    #"Filtered Rows" = Table.SelectRows(#"Expanded Document", each [Document.date] > DateTime.ToText(RangeStart,"yyyy-MM-dd") and [Document.date] < DateTime.ToText(RangeEnd,"yyyy-MM-dd"))
    

    根据源数据集中存在的列和数据类型,可以相应地更改 RangeStart 和 RangeEnd 字段Depending on which column and data type is present in the source dataset, you can change the RangeStart and RangeEnd fields accordingly

    属性Property 数据类型Data type 筛选器Filter
    _ts_ts NumericNumeric [_ts] > Duration.TotalSeconds(RangeStart - #datetime(1970, 1, 1, 0, 0, 0)) and [_ts] < Duration.TotalSeconds(RangeEnd - #datetime(1970, 1, 1, 0, 0, 0)))[_ts] > Duration.TotalSeconds(RangeStart - #datetime(1970, 1, 1, 0, 0, 0)) and [_ts] < Duration.TotalSeconds(RangeEnd - #datetime(1970, 1, 1, 0, 0, 0)))
    Date(例如:- 2019-08-19)Date (for example:- 2019-08-19) StringString [Document.date]> DateTime.ToText(RangeStart,"yyyy-MM-dd") and [Document.date] < DateTime.ToText(RangeEnd,"yyyy-MM-dd")[Document.date]> DateTime.ToText(RangeStart,"yyyy-MM-dd") and [Document.date] < DateTime.ToText(RangeEnd,"yyyy-MM-dd")
    Date(例如:- 2019-08-11 12:00:00)Date (for example:- 2019-08-11 12:00:00) StringString [Document.date]> DateTime.ToText(RangeStart," yyyy-mm-dd HH:mm:ss") and [Document.date] < DateTime.ToText(RangeEnd,"yyyy-mm-dd HH:mm:ss")[Document.date]> DateTime.ToText(RangeStart," yyyy-mm-dd HH:mm:ss") and [Document.date] < DateTime.ToText(RangeEnd,"yyyy-mm-dd HH:mm:ss")
  3. 定义刷新策略 - 导航到表的上下文菜单中的“增量刷新”选项卡来定义刷新策略。 Define the refresh policy - Define the refresh policy by navigating to the Incremental refresh tab on the context menu for the table. 将刷新策略设置为每日刷新并存储上个月的数据。Set the refresh policy to refresh every day and store the last month data.

    定义刷新策略

    忽略警告“无法确认是否要折叠 M 查询”。 Ignore the warning that says the M query cannot be confirmed to be folded. Azure Cosmos DB 连接器将折叠筛选器查询。The Azure Cosmos DB connector folds filter queries.

  4. 加载数据并生成报表 - 使用前面加载的数据,创建用于报告气温和降水量的图表。Load the data and generate the reports - By using the data you have loaded earlier, create the charts to report on temperature and rainfall.

    加载数据并生成报表

  5. 将报表发布到 Power BI Premium - 由于增量刷新是仅在 Premium 版本中提供的功能,发布对话框仅允许选择 Premium 容量中的工作区。Publish the report to Power BI premium - Since incremental refresh is a Premium only feature, the publish dialog only allows selection of a workspace on Premium capacity. 由于需要导入历史数据,首次刷新所需的时间可能较长。The first refresh may take longer to import the historical data. 后续的数据刷新速度要快得多,因为使用了增量刷新。Subsequent data refreshes are much quicker because they use incremental refresh.

Power BI Azure Analysis 连接器 + Azure Analysis ServicesPower BI Azure Analysis connector + Azure Analysis Services

将天气数据引入 Azure Cosmos DBIngest weather data into Azure Cosmos DB

设置一个引入管道,以将天气数据加载到 Azure Cosmos DB 中。Set up an ingestion pipeline to load weather data to Azure Cosmos DB. 可以设置一个 Azure 数据工厂 (ADF) 作业,以使用 HTTP 源和 Cosmos DB 接收器定期将最新的天气数据载入 Azure Cosmos DB。You can set up an Azure Data Factory(ADF) job to periodically load the latest weather data into Azure Cosmos DB using the HTTP Source and Cosmos DB Sink.

将 Azure Analysis Services 连接到 Azure Cosmos 帐户Connect Azure Analysis Services to Azure Cosmos account

  1. 创建新的 Azure Analysis Services 群集 - 在 Azure Cosmos 帐户和 Databricks 群集所在的同一区域中创建 Azure Analysis Services 的实例Create a new Azure Analysis Services cluster - Create an instance of Azure Analysis services in the same region as the Azure Cosmos account and the Databricks cluster.

  2. 在 Visual Studio 中创建新的 Analysis Services 表格项目 - 安装 SQL Server Data Tools (SSDT),并在 Visual Studio 中创建一个 Analysis Services 表格项目。Create a new Analysis Services Tabular Project in Visual Studio - Install the SQL Server Data Tools (SSDT) and create an Analysis Services Tabular project in Visual Studio.

    创建 Azure Analysis Services 项目

    选择“集成式工作区”实例,并将“兼容性级别”设置为“SQL Server 2017 / Azure Analysis Services (1400)” Choose the Integrated Workspace instance and the set the Compatibility Level to SQL Server 2017 / Azure Analysis Services (1400)

    Azure Analysis Services 表格模型设计器

  3. 添加 Azure Cosmos DB 数据源 - 导航到“模型”> “数据源” > “新建数据源”,并按以下屏幕截图中所示添加 Azure Cosmos DB 数据源: Add the Azure Cosmos DB data source - Navigate to Models> Data Sources > New Data Source and add the Azure Cosmos DB data source as shown in the following screenshot:

    添加 Cosmos DB 数据源

    提供帐户 URI数据库名称容器名称以连接到 Azure Cosmos DB。Connect to Azure Cosmos DB by providing the account URI, database name, and the container name. 现在可以看到,Azure Cosmos 容器中的数据已导入到 Power BI 中。You can now see the data from Azure Cosmos container is imported into Power BI.

    预览 Azure Cosmos DB 数据

  4. 构建 Analysis Services 模型 - 打开查询编辑器,执行所需的操作以优化加载的数据集:Construct the Analysis Services model - Open the query editor, perform the required operations to optimize the loaded data set:

    • 仅提取天气相关的列(气温和降水量)Extract only the weather-related columns (temperature and rainfall)

    • 从表中提取月份信息。Extract the month information from the table. 根据下一部分所述创建分区时,此数据非常有用。This data is useful in creating partitions as described in the next section.

    • 将气温列转换为数字Convert the temperature columns to number

    生成的 M 表达式如下所示:The resulting M expression is as follows:

    let
        Source=#"DocumentDB/https://[ACCOUNTNAME].documents.azure.cn:443/",
        #"Expanded Document" = Table.ExpandRecordColumn(Source, "Document", {"id", "_rid", "_self", "_etag", "fogground", "snowfall", "dust", "snowdepth", "mist", "drizzle", "hail", "fastest2minwindspeed", "thunder", "glaze", "snow", "ice", "fog", "temperaturemin", "fastest5secwindspeed", "freezingfog", "temperaturemax", "blowingsnow", "freezingrain", "rain", "highwind", "date", "precipitation", "fogheavy", "smokehaze", "avgwindspeed", "fastest2minwinddir", "fastest5secwinddir", "_attachments", "_ts"}, {"Document.id", "Document._rid", "Document._self", "Document._etag", "Document.fogground", "Document.snowfall", "Document.dust", "Document.snowdepth", "Document.mist", "Document.drizzle", "Document.hail", "Document.fastest2minwindspeed", "Document.thunder", "Document.glaze", "Document.snow", "Document.ice", "Document.fog", "Document.temperaturemin", "Document.fastest5secwindspeed", "Document.freezingfog", "Document.temperaturemax", "Document.blowingsnow", "Document.freezingrain", "Document.rain", "Document.highwind", "Document.date", "Document.precipitation", "Document.fogheavy", "Document.smokehaze", "Document.avgwindspeed", "Document.fastest2minwinddir", "Document.fastest5secwinddir", "Document._attachments", "Document._ts"}),
        #"Select Columns" = Table.SelectColumns(#"Expanded Document",{"Document.temperaturemin", "Document.temperaturemax", "Document.rain", "Document.date"}),
        #"Duplicated Column" = Table.DuplicateColumn(#"Select Columns", "Document.date", "Document.month"),
        #"Extracted First Characters" = Table.TransformColumns(#"Duplicated Column", {{"Document.month", each Text.Start(_, 7), type text}}),
        #"Sorted Rows" = Table.Sort(#"Extracted First Characters",{{"Document.date", Order.Ascending}}),
        #"Changed Type" = Table.TransformColumnTypes(#"Sorted Rows",{{"Document.temperaturemin", type number}, {"Document.temperaturemax", type number}}),
        #"Filtered Rows" = Table.SelectRows(#"Changed Type", each [Document.month] = "2019-07")
    in
        #"Filtered Rows"
    

    此外,将气温列的数据类型更改为“Decimal”,以确保可以在 Power BI 中绘出这些值。Additionally, change the data type of the temperature columns to Decimal to make sure that these values can be plotted in Power BI.

  5. 创建 Azure Analysis 分区 - 在 Azure Analysis Services 中创建分区,以将数据集划分到可按不同的频率独立刷新的逻辑分区。Create Azure Analysis partitions - Create partitions in Azure Analysis Services to divide the dataset into logical partitions that can be refreshed independently and at different frequencies. 此示例将创建两个分区,以将数据集划分为最近月份的数据和任何其他内容。In this example, you create two partitions that would divide the dataset into the most recent month's data and everything else.

    创建 Analysis Services 分区

    在 Azure Analysis Services 中创建以下两个分区:Create the following two partitions in Azure Analysis Services:

    • 最近月份 - #"Filtered Rows" = Table.SelectRows(#"Sorted Rows", each [Document.month] = "2019-07")Latest Month - #"Filtered Rows" = Table.SelectRows(#"Sorted Rows", each [Document.month] = "2019-07")
    • 历史 - #"Filtered Rows" = Table.SelectRows(#"Sorted Rows", each [Document.month] <> "2019-07")Historical - #"Filtered Rows" = Table.SelectRows(#"Sorted Rows", each [Document.month] <> "2019-07")
  6. 将模型部署到 Azure Analysis Server - 右键单击 Azure Analysis Services 项目并选择“部署”。 Deploy the Model to the Azure Analysis Server - Right click on the Azure Analysis Services project and choose Deploy. 在“部署服务器属性”窗格中添加服务器名称。 Add the server name in the Deployment Server properties pane.

    部署 Azure Analysis Services 模型

  7. 配置分区刷新与合并 - Azure Analysis Services 允许独立处理分区。Configure partition refreshes and merges - Azure Analysis Services allows independent processing of partitions. 由于我们希望使用最新的数据持续更新“最近月份”分区,因此请将刷新间隔设置为 5 分钟。 Since we want the Latest Month partition to be constantly updated with the most recent data, set the refresh interval to 5 minutes. 可以使用 REST API逻辑应用刷新数据。You can refresh the data by using the REST API, or with a Logic App. 不需要刷新历史分区中的数据。It's not required to refresh the data in historical partition. 此外,还需要编写一些代码将“最近月份”分区合并到历史分区,并创建一个新的“最近月份”分区。Additionally, you need to write some code to consolidate the latest month partition to the historical partition and create a new latest month partition.

将 Power BI 连接到 Analysis ServicesConnect Power BI to Analysis Services

  1. 使用 Azure Analysis Services 数据库连接器连接到 Azure Analysis Server - 选择“实时模式”并连接到 Azure Analysis Services 实例,如以下屏幕截图中所示:Connect to the Azure Analysis Server using the Azure Analysis Services database Connector - Choose the Live mode and connect to the Azure Analysis Services instance as shown in the following screenshot:

    从 Azure Analysis Services 中获取数据

  2. 加载数据并生成报表 - 使用前面加载的数据,创建用于报告气温和降水量的图表。Load the data and generate reports - By using the data you have loaded earlier, create charts to report on temperature and rainfall. 由于创建的是实时连接,因此应该针对在上一步骤中部署的 Azure Analysis Services 模型中的数据执行查询。Since you are creating a live connection, the queries should be executed on the data in the Azure Analysis Services model that you have deployed in the previous step. 在将新数据载入 Azure Cosmos DB 后的五分钟内,气温图表将会更新。The temperature charts will be updated within five minutes after the new data is loaded into Azure Cosmos DB.

    加载数据并生成报表

后续步骤Next steps