Azure Analysis Services scale-out

With scale-out, client queries can be distributed among multiple query replicas in a query pool, reducing response times during high query workloads. You can also separate processing from the query pool, ensuring client queries are not adversely affected by processing operations. Scale-out can be configured in the Azure portal or by using the Analysis Services REST API.

Scale-out is available for servers in the Standard pricing tier. Each query replica is billed at the same rate as your server. All query replicas are created in the same region as your server. The number of query replicas you can configure is limited by the region your server is in. To learn more, see Availability by region. Scale-out does not increase the amount of available memory for your server. To increase memory, you need to upgrade your plan.

Why scale-out?

In a typical server deployment, one server serves as both processing server and query server. If the number of client queries against models on your server exceeds the Query Processing Units (QPU) for your server's plan, or if model processing occurs at the same time as high query workloads, performance can decrease.

With scale-out, you can create a query pool with up to one additional query replica resource (two total, including your primary server). You can scale the number of replicas in the query pool to meet QPU demands at critical times, and you can separate a processing server from the query pool at any time.

Regardless of the number of query replicas you have in a query pool, processing workloads are not distributed among query replicas. The primary server serves as the processing server. Query replicas serve only queries against the model databases synchronized between the primary server and each replica in the query pool.

When scaling out, it can take up to five minutes for new query replicas to be incrementally added to the query pool. When all new query replicas are up and running, new client connections are load balanced across resources in the query pool. Existing client connections stay on the resource they are currently connected to. When scaling in, any existing client connections to a query pool resource that is being removed from the query pool are terminated. Clients can reconnect to a remaining query pool resource.

How it works

When configuring scale-out for the first time, model databases on your primary server are automatically synchronized with new replicas in a new query pool. Automatic synchronization occurs only once. During automatic synchronization, the primary server's data files (encrypted at rest in blob storage) are copied to a second location, also encrypted at rest in blob storage. Replicas in the query pool are then hydrated with data from the second set of files.

While an automatic synchronization is performed only when you scale out a server for the first time, you can also perform a manual synchronization. Synchronization ensures that data on replicas in the query pool matches that of the primary server. When processing (refreshing) models on the primary server, a synchronization must be performed after processing operations are completed. This synchronization copies updated data from the primary server's files in blob storage to the second set of files. Replicas in the query pool are then hydrated with updated data from the second set of files in blob storage.

When performing a subsequent scale-out operation, for example, increasing the number of replicas in the query pool from two to five, the new replicas are hydrated with data from the second set of files in blob storage. There is no synchronization. If you were to then perform a synchronization after scaling out, the new replicas in the query pool would be hydrated twice, which is redundant. When performing a subsequent scale-out operation, it's important to keep in mind:

  • Perform a synchronization before the scale-out operation to avoid redundant hydration of the added replicas. Synchronization and scale-out operations cannot run at the same time.

  • When automating both processing and scale-out operations, it's important to first process data on the primary server, then perform a synchronization, and then perform the scale-out operation. This sequence ensures minimal impact on QPU and memory resources.

  • Synchronization is allowed even when there are no replicas in the query pool. If you are scaling out from zero to one or more replicas with new data from a processing operation on the primary server, perform the synchronization first with no replicas in the query pool, and then scale out. Synchronizing before scaling out avoids redundant hydration of the newly added replicas.

  • When deleting a model database from the primary server, it does not automatically get deleted from replicas in the query pool. You must perform a synchronization operation by using the Sync-AzAnalysisServicesInstance PowerShell command, which removes the files for that database from the replica's shared blob storage location and then deletes the model database on the replicas in the query pool. To determine if a model database exists on replicas in the query pool but not on the primary server, ensure the Separate the processing server from querying pool setting is set to Yes. Then use SSMS to connect to the primary server using the :rw qualifier to see if the database exists. Then connect to replicas in the query pool by connecting without the :rw qualifier to see if the same database also exists. If the database exists on replicas in the query pool but not on the primary server, run a sync operation.

  • When renaming a database on the primary server, an additional step is necessary to ensure the database is properly synchronized to any replicas. After renaming, perform a synchronization by using the Sync-AzAnalysisServicesInstance command, specifying the -Database parameter with the old database name. This synchronization removes the database and files with the old name from any replicas. Then perform another synchronization, specifying the -Database parameter with the new database name. The second synchronization copies the newly named database to the second set of files and hydrates any replicas. These synchronizations cannot be performed by using the Synchronize model command in the portal.
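
The ordering rules in the list above can be summarized in a small Python sketch. It only computes the order of operations; the function name plan_operations and the step labels are hypothetical and are not part of any Azure SDK, and the real work would still be done through the portal, PowerShell, or the REST API.

```python
from typing import List


def plan_operations(process_needed: bool, add_replicas: bool) -> List[str]:
    """Return the order of operations suggested by the guidance above.

    The step labels are illustrative names only (an assumption of this
    sketch), not Azure commands.
    """
    steps: List[str] = []
    if process_needed:
        # Process (refresh) models on the primary server first.
        steps.append("process")
        # A synchronization must follow every processing operation.
        steps.append("synchronize")
    elif add_replicas:
        # Synchronize before scaling out to avoid redundant hydration.
        steps.append("synchronize")
    if add_replicas:
        # New replicas are hydrated from the already synchronized files.
        steps.append("scale_out")
    return steps


if __name__ == "__main__":
    print(plan_operations(process_needed=True, add_replicas=True))
```

Because synchronization and scale-out cannot run at the same time, an automation script following this plan would wait for each step to finish before starting the next.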

Separate processing from query pool

For maximum performance for both processing and query operations, you can choose to separate your processing server from the query pool. When separated, new client connections are assigned to query replicas in the query pool only. If processing operations take only a short amount of time, you can separate your processing server from the query pool just for the time it takes to perform processing and synchronization operations, and then include it back into the query pool. Separating the processing server from the query pool, or adding it back into the query pool, can take up to five minutes to complete.

Monitor QPU usage

To determine if scale-out for your server is necessary, monitor your server in the Azure portal by using Metrics. If your QPU regularly maxes out, it means the number of queries against your models is exceeding the QPU limit for your plan. The Query pool job queue length metric also increases when the number of queries in the query thread pool queue exceeds available QPU.

Another good metric to watch is average QPU by ServerResourceType. This metric compares average QPU for the primary server with that of the query pool.
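
As a rough illustration of "QPU regularly maxes out", the following Python sketch computes how often exported QPU samples sit at or near the plan limit. The 95% tolerance and the 50% decision threshold are arbitrary assumptions chosen for the example, not values from Azure documentation, and the function names are hypothetical.

```python
from typing import Sequence


def qpu_saturation_ratio(samples: Sequence[float], qpu_limit: float,
                         tolerance: float = 0.95) -> float:
    """Fraction of samples at or above tolerance * qpu_limit.

    `tolerance` is an assumed cutoff for "maxed out".
    """
    if not samples:
        return 0.0
    threshold = tolerance * qpu_limit
    hits = sum(1 for value in samples if value >= threshold)
    return hits / len(samples)


def should_consider_scale_out(samples: Sequence[float], qpu_limit: float,
                              min_ratio: float = 0.5) -> bool:
    """True when at least `min_ratio` of the samples are near the limit.

    `min_ratio` is an assumed threshold; tune it to your own workload.
    """
    return qpu_saturation_ratio(samples, qpu_limit) >= min_ratio


if __name__ == "__main__":
    qpu_samples = [100, 98, 40, 100]  # hypothetical QPU readings
    print(should_consider_scale_out(qpu_samples, qpu_limit=100))
```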

To configure QPU by ServerResourceType

  1. In a Metrics line chart, click Add metric.
  2. In RESOURCE, select your server, then in METRIC NAMESPACE, select Analysis Services standard metrics, then in METRIC, select QPU, and then in AGGREGATION, select Avg.
  3. Click Apply Splitting.
  4. In VALUES, select ServerResourceType.

To learn more, see Monitor server metrics.

Configure scale-out

In Azure portal

  1. In the portal, click Scale-out. Use the slider to select the number of query replica servers. The number of replicas you choose is in addition to your existing server.

  2. In Separate the processing server from the querying pool, select Yes to exclude your processing server from the query servers. Client connections using the default connection string (without :rw) are redirected to replicas in the query pool.

  3. Click Save to provision your new query replica servers.

When configuring scale-out for a server for the first time, models on your primary server are automatically synchronized with replicas in the query pool. Automatic synchronization occurs only once, when you first configure scale-out to one or more replicas. Subsequent changes to the number of replicas on the same server will not trigger another automatic synchronization. Automatic synchronization will not occur again even if you set the server to zero replicas and then scale out again to any number of replicas.

Synchronize

Synchronization operations must be performed manually or by using the REST API.

In Azure portal

In Overview > model > Synchronize model.

REST API

Use the sync operation.

Synchronize a model

POST https://<region>.asazure.chinacloudapi.cn/servers/<servername>:rw/models/<modelname>/sync

Get sync status

GET https://<region>.asazure.chinacloudapi.cn/servers/<servername>/models/<modelname>/sync

Return status codes:

Code    Description
-1      Invalid
0       Replicating
1       Rehydrating
2       Completed
3       Failed
4       Finalizing
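
The two endpoints and the status codes above can be wrapped in a few Python helpers. This is a minimal sketch: it only builds URLs and interprets codes, and it does not send requests or handle authentication. The names SyncState, sync_url, and is_finished are hypothetical, and treating only Completed and Failed as final states is an assumption of this sketch.

```python
from enum import IntEnum


class SyncState(IntEnum):
    """Status codes returned by the sync status request (from the table above)."""
    INVALID = -1
    REPLICATING = 0
    REHYDRATING = 1
    COMPLETED = 2
    FAILED = 3
    FINALIZING = 4


URL_TEMPLATE = ("https://{region}.asazure.chinacloudapi.cn"
                "/servers/{server}{qualifier}/models/{model}/sync")


def sync_url(region: str, server: str, model: str, *, start: bool) -> str:
    """Build the sync URL.

    start=True  -> URL for POST (start a sync); uses the :rw qualifier.
    start=False -> URL for GET (read sync status); no qualifier.
    """
    qualifier = ":rw" if start else ""
    return URL_TEMPLATE.format(region=region, server=server,
                               qualifier=qualifier, model=model)


def is_finished(code: int) -> bool:
    """Assumption: only Completed and Failed mean the sync has ended."""
    return SyncState(code) in (SyncState.COMPLETED, SyncState.FAILED)


if __name__ == "__main__":
    # "chinaeast2", "myserver", and "sales" are placeholder values.
    print(sync_url("chinaeast2", "myserver", "sales", start=True))
    print(SyncState(1).name, is_finished(1))
```

A polling loop would issue the GET request at intervals and stop once is_finished returns True for the returned code.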

PowerShell

Note

This article has been updated to use the new Azure PowerShell Az module. You can still use the AzureRM module, which will continue to receive bug fixes until at least December 2020. To learn more about the new Az module and AzureRM compatibility, see Introducing the new Azure PowerShell Az module. For Az module installation instructions, see Install Azure PowerShell.

Before using PowerShell, install or update the latest Azure PowerShell module.

To run sync, use Sync-AzAnalysisServicesInstance.

To set the number of query replicas, use Set-AzAnalysisServicesServer. Specify the optional -ReadonlyReplicaCount parameter.

To separate the processing server from the query pool, use Set-AzAnalysisServicesServer. Set the optional -DefaultConnectionMode parameter to Readonly.

To learn more, see Using a service principal with the Az.AnalysisServices module.

Connections

On your server's Overview page, there are two server names. If you haven't yet configured scale-out for a server, both server names work the same. Once you configure scale-out for a server, you need to specify the appropriate server name depending on the connection type.

For end-user client connections like Power BI Desktop, Excel, and custom apps, use Server name.

For SSMS, SSDT, and connection strings in PowerShell, Azure Function apps, and AMO, use Management server name. The management server name includes a special :rw (read-write) qualifier. All processing operations occur on the (primary) management server.
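
The relationship between the two names can be sketched in Python as plain string handling. The helper names are hypothetical, and the asazure:// server name used in the example is an assumed format with placeholder region and server values; copy the actual names from your server's Overview page.

```python
RW_QUALIFIER = ":rw"


def management_server_name(server_name: str) -> str:
    """Return the name with the :rw qualifier, adding it if missing."""
    if server_name.endswith(RW_QUALIFIER):
        return server_name
    return server_name + RW_QUALIFIER


def query_server_name(server_name: str) -> str:
    """Return the name without the :rw qualifier, for end-user clients."""
    if server_name.endswith(RW_QUALIFIER):
        return server_name[:-len(RW_QUALIFIER)]
    return server_name


if __name__ == "__main__":
    # Placeholder name in an assumed format.
    name = "asazure://chinaeast2.asazure.chinacloudapi.cn/myserver"
    print(management_server_name(name))  # for SSMS, SSDT, PowerShell, AMO
    print(query_server_name(name))       # for Power BI Desktop, Excel, apps
```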

Troubleshoot

Issue: Users get the error Cannot find server '<Name of the server>' instance in connection mode 'ReadOnly'.

Solution: When the Separate the processing server from the querying pool option is selected, client connections using the default connection string (without :rw) are redirected to query pool replicas. If replicas in the query pool are not yet online because synchronization has not yet been completed, redirected client connections can fail. To prevent failed connections, there must be at least two servers in the query pool when performing a synchronization. Each server is synchronized individually while the others remain online. If you choose not to have the processing server in the query pool during processing, you can remove it from the pool for processing, and then add it back into the pool after processing is complete, but prior to synchronization. Use Memory and QPU metrics to monitor synchronization status.
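
The "at least two servers in the query pool" condition can be checked with simple arithmetic, sketched here in Python. The function names are hypothetical, and the sketch assumes the pool consists of the query replicas plus the primary server unless the primary has been separated from the pool.

```python
def query_pool_size(replica_count: int, processing_separated: bool) -> int:
    """Number of servers answering queries.

    Assumption: the primary server counts toward the pool only when it
    has not been separated from the query pool.
    """
    if replica_count < 0:
        raise ValueError("replica_count must not be negative")
    return replica_count + (0 if processing_separated else 1)


def sync_keeps_pool_online(replica_count: int,
                           processing_separated: bool) -> bool:
    """True when at least two servers are in the pool, so one can be
    synchronized while another stays online."""
    return query_pool_size(replica_count, processing_separated) >= 2


if __name__ == "__main__":
    # One replica with the primary separated: only one server in the pool.
    print(sync_keeps_pool_online(1, processing_separated=True))
    # Adding the primary back before synchronizing gives two servers.
    print(sync_keeps_pool_online(1, processing_separated=False))
```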

Monitor server metrics
Manage Azure Analysis Services