How to audit Azure Cosmos DB control plane operations

Applies to: SQL API, Cassandra API, Gremlin API, Table API, Azure Cosmos DB API for MongoDB

The control plane in Azure Cosmos DB is a RESTful service that lets you perform a diverse set of operations on an Azure Cosmos account. It exposes a public resource model (for example, database and account) and various operations that end users can perform on that model. Control plane operations include changes to the Azure Cosmos account or container. For example, creating an Azure Cosmos account, adding a region, updating throughput, failing over a region, and adding a VNet are all control plane operations. This article explains how to audit the control plane operations in Azure Cosmos DB. You can run control plane operations on Azure Cosmos accounts by using the Azure CLI, PowerShell, or the Azure portal; for containers, use the Azure CLI or PowerShell.

The following are some example scenarios where auditing control plane operations is helpful:

  • You want to get an alert when the firewall rules for your Azure Cosmos account are modified. The alert helps you find unauthorized modifications to the rules that govern the network security of your Azure Cosmos account and take quick action.

  • You want to get an alert if a region is added to or removed from your Azure Cosmos account. Adding or removing regions has implications for billing and data sovereignty requirements. This alert helps you detect an accidental addition or removal of a region on your account.

  • You want to get more details from the diagnostic logs on what has changed. For example, a VNet was changed.

Disable key-based metadata write access

Before you audit the control plane operations in Azure Cosmos DB, disable key-based metadata write access on your account. When key-based metadata write access is disabled, clients that connect to the Azure Cosmos account through account keys are prevented from making metadata changes to the account. You can disable write access by setting the disableKeyBasedMetadataWriteAccess property to true. After you set this property, changes to any resource can be made only by a user with the proper role-based access control (RBAC) role and credentials. To learn more about how to set this property, see the Preventing changes from SDKs article.

After disableKeyBasedMetadataWriteAccess is turned on, if SDK-based clients run create or update operations, the error "Operation 'POST' on resource 'ContainerNameorDatabaseName' is not allowed through Azure Cosmos DB endpoint" is returned. You have to turn on access to such operations for your account, or perform the create/update operations through Azure Resource Manager, Azure CLI, or Azure PowerShell. To switch back, set disableKeyBasedMetadataWriteAccess to false (not true) by using Azure CLI, as described in the Preventing changes from Cosmos SDK article.
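As a rough illustration of the property and the error described above, the following Python sketch builds the property fragment and recognizes the blocked-operation error text. It assumes the property sits under `properties` on the account resource, as in a Resource Manager template. The helper names `account_properties_fragment` and `blocked_resource` are hypothetical and not part of any SDK, and the error pattern is taken only from the message quoted in this article.

```python
import json
import re

# Error text as quoted in this article; verb and resource name are captured.
BLOCKED_ERROR = re.compile(
    r"Operation '(?P<verb>\w+)' on resource '(?P<resource>[^']+)' "
    r"is not allowed through Azure Cosmos DB endpoint"
)


def account_properties_fragment(disable: bool) -> dict:
    """Hypothetical helper: property fragment for an account update.

    Assumes the flag lives under "properties" on the account resource.
    """
    return {"properties": {"disableKeyBasedMetadataWriteAccess": disable}}


def blocked_resource(error_message: str):
    """Return the blocked resource name, or None for unrelated messages."""
    match = BLOCKED_ERROR.search(error_message)
    return match.group("resource") if match else None


if __name__ == "__main__":
    print(json.dumps(account_properties_fragment(True)))
    print(blocked_resource(
        "Operation 'POST' on resource 'myContainer' "
        "is not allowed through Azure Cosmos DB endpoint"
    ))
```

A client could use a check like `blocked_resource` to tell this configuration error apart from other failures and route the create or update through Azure Resource Manager instead.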

Consider the following points when turning off metadata write access:

  • Evaluate and ensure that your applications do not make metadata calls that change the above resources (for example, create collection or update throughput) by using the SDK or account keys.

  • Currently, the Azure portal uses account keys for metadata operations, so these operations will be blocked. Alternatively, use the Azure CLI, SDKs, or Resource Manager template deployments to perform such operations.

Enable diagnostic logs for control plane operations

You can enable diagnostic logs for control plane operations by using the Azure portal. Once enabled, the diagnostic logs record each operation as a pair of start and complete events with relevant details. For example, RegionFailoverStart and RegionFailoverComplete together record a region failover.
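To make the start/complete pairing concrete, here is a minimal Python sketch that groups operation names by their base operation and reports whether both events were seen. It assumes you already have a list of OperationName values exported from the logs. The function name `pair_operations` is hypothetical.

```python
from collections import defaultdict


def pair_operations(operation_names):
    """Map each base operation to True if both Start and Complete were seen.

    Names that end in neither "Start" nor "Complete" are ignored.
    """
    phases = defaultdict(set)
    for name in operation_names:
        for suffix in ("Start", "Complete"):
            if name.endswith(suffix):
                # Strip the suffix to get the base, e.g. "RegionFailover".
                phases[name[: -len(suffix)]].add(suffix)
                break
    return {base: seen == {"Start", "Complete"} for base, seen in phases.items()}


if __name__ == "__main__":
    names = ["RegionFailoverStart", "RegionFailoverComplete", "RegionAddStart"]
    print(pair_operations(names))
```

An operation that shows a start event without a matching complete event may still be in progress, or its complete event may fall outside the queried time window.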

Use the following steps to enable logging on control plane operations:

  1. Sign in to the Azure portal and navigate to your Azure Cosmos account.

  2. Open the Diagnostic settings pane and provide a Name for the logs to create.

  3. Select ControlPlaneRequests for the log type and select the Send to Log Analytics option.

You can also store the logs in a storage account or stream them to an event hub. This article shows how to send logs to Log Analytics and then query them. After you enable diagnostic logs, it takes a few minutes for them to take effect. All control plane operations performed after that point can be tracked. The following screenshot shows how to enable control plane logs:

[Screenshot: enabling control plane requests logging]

View the control plane operations

After you turn on logging, use the following steps to track down operations for a specific account:

  1. Sign in to the Azure portal.

  2. Open the Monitor tab from the left-hand navigation and then select the Logs pane. It opens a UI where you can run queries with that specific account in scope. Run the following query to view control plane logs:

    AzureDiagnostics
    | where ResourceProvider=="MICROSOFT.DOCUMENTDB" and Category=="ControlPlaneRequests"
    | where TimeGenerated >= ago(1h)
    

The following screenshots capture logs when a consistency level is changed for an Azure Cosmos account:

[Screenshot: control plane logs when a VNet is added]

The following screenshots capture logs when a keyspace or a table of a Cassandra account is created and when the throughput is updated. The control plane logs for create and update operations on the database and the container are logged separately, as shown in the following screenshot:

[Screenshot: control plane logs when throughput is updated]

Identify the identity associated with a specific operation

If you want to debug further, you can identify a specific operation in the Activity log by using the activity ID or the timestamp of the operation. The timestamp is used for some Resource Manager clients where the activity ID is not explicitly passed. The Activity log gives details about the identity with which the operation was initiated. The following screenshot shows how to use the activity ID to find the operations associated with it in the Activity log:

[Screenshot: using the activity ID to find the operations]

Control plane operations for Azure Cosmos account

The following are the control plane operations available at the account level. Most of the operations are tracked at the account level. These operations are available as metrics in Azure Monitor:

  • Region added
  • Region removed
  • Account deleted
  • Region failed over
  • Account created
  • Virtual network deleted
  • Account network settings updated
  • Account replication settings updated
  • Account keys updated
  • Account backup settings updated
  • Account diagnostic settings updated

Control plane operations for database or containers

The following are the control plane operations available at the database and container level. These operations are available as metrics in Azure Monitor:

  • SQL Database Created
  • SQL Database Updated
  • SQL Database Throughput Updated
  • SQL Database Deleted
  • SQL Container Created
  • SQL Container Updated
  • SQL Container Throughput Updated
  • SQL Container Deleted
  • Cassandra Keyspace Created
  • Cassandra Keyspace Updated
  • Cassandra Keyspace Throughput Updated
  • Cassandra Keyspace Deleted
  • Cassandra Table Created
  • Cassandra Table Updated
  • Cassandra Table Throughput Updated
  • Cassandra Table Deleted
  • Gremlin Database Created
  • Gremlin Database Updated
  • Gremlin Database Throughput Updated
  • Gremlin Database Deleted
  • Gremlin Graph Created
  • Gremlin Graph Updated
  • Gremlin Graph Throughput Updated
  • Gremlin Graph Deleted
  • Mongo Database Created
  • Mongo Database Updated
  • Mongo Database Throughput Updated
  • Mongo Database Deleted
  • Mongo Collection Created
  • Mongo Collection Updated
  • Mongo Collection Throughput Updated
  • Mongo Collection Deleted
  • AzureTable Table Created
  • AzureTable Table Updated
  • AzureTable Table Throughput Updated
  • AzureTable Table Deleted

Diagnostic log operations

The following are the operation names in diagnostic logs for different operations:

  • RegionAddStart, RegionAddComplete
  • RegionRemoveStart, RegionRemoveComplete
  • AccountDeleteStart, AccountDeleteComplete
  • RegionFailoverStart, RegionFailoverComplete
  • AccountCreateStart, AccountCreateComplete
  • AccountUpdateStart, AccountUpdateComplete
  • VirtualNetworkDeleteStart, VirtualNetworkDeleteComplete
  • DiagnosticLogUpdateStart, DiagnosticLogUpdateComplete

For API-specific operations, the operation is named with the following format:

  • ApiKind + ApiKindResourceType + OperationType
  • ApiKind + ApiKindResourceType + "Throughput" + OperationType

Example

  • CassandraKeyspacesCreate
  • CassandraKeyspacesUpdate
  • CassandraKeyspacesThroughputUpdate
  • SqlContainersUpdate
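The naming format above can be sketched as a small Python helper that concatenates the parts and, optionally, turns the result into a Kusto filter line. The function names `operation_name` and `kql_filter` are hypothetical, and the only inputs checked here are the ones listed in the examples above.

```python
def operation_name(api_kind, resource_type, operation_type, throughput=False):
    """Compose ApiKind + ApiKindResourceType [+ "Throughput"] + OperationType."""
    middle = "Throughput" if throughput else ""
    return f"{api_kind}{resource_type}{middle}{operation_type}"


def kql_filter(name):
    """Build a Kusto `where` clause that matches one operation name exactly."""
    return f'| where OperationName == "{name}"'


if __name__ == "__main__":
    print(operation_name("Cassandra", "Keyspaces", "Create"))
    print(operation_name("Cassandra", "Keyspaces", "Update", throughput=True))
    print(kql_filter(operation_name("Sql", "Containers", "Update")))
```

Building the name from its parts avoids typos when filtering logs across several APIs and resource types.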

The ResourceDetails property contains the entire resource body as a request payload, including all the properties that the request asked to update.

Diagnostic log queries for control plane operations

The following are some examples to get diagnostic logs for control plane operations:

AzureDiagnostics
| where Category startswith "ControlPlane"
| where OperationName contains "Update"
| project httpstatusCode_s, statusCode_s, OperationName, resourceDetails_s, activityId_g

AzureDiagnostics
| where Category == "ControlPlaneRequests"
| where TimeGenerated >= todatetime('2020-05-14T17:37:09.563Z')
| project TimeGenerated, OperationName, apiKind_s, apiKindResourceType_s, operationType_s, resourceDetails_s

AzureDiagnostics
| where Category == "ControlPlaneRequests"
| where OperationName startswith "SqlContainersUpdate"

AzureDiagnostics
| where Category == "ControlPlaneRequests"
| where OperationName startswith "SqlContainersThroughputUpdate"

Query to get the activityId and the caller who initiated the container delete operation:

(AzureDiagnostics
| where Category == "ControlPlaneRequests"
| where OperationName == "SqlContainersDelete"
| where TimeGenerated >= todatetime('9/3/2020, 5:30:29.300 PM')
| summarize by activityId_g )
| join (
AzureActivity
| parse HTTPRequest with * "clientRequestId\": \"" activityId_g "\"" * 
| summarize by Caller, HTTPRequest, activityId_g)
on activityId_g
| project Caller, activityId_g
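The join in the query above can be mirrored offline to see what it does. This Python sketch extracts the clientRequestId from each Activity log HTTPRequest value and matches it against the activity IDs of container delete operations. It assumes both result sets have been exported as lists of dictionaries keyed by the column names used in the query. The function name `callers_for_deletes` and the sample rows are hypothetical.

```python
import re

# Same literal shape the Kusto `parse` pattern matches in HTTPRequest.
CLIENT_REQUEST_ID = re.compile(r'"clientRequestId": "([^"]+)"')


def callers_for_deletes(diagnostic_rows, activity_rows):
    """Return sorted (Caller, activityId) pairs for SqlContainersDelete operations."""
    delete_ids = {
        row["activityId_g"]
        for row in diagnostic_rows
        if row["OperationName"] == "SqlContainersDelete"
    }
    matches = set()
    for row in activity_rows:
        found = CLIENT_REQUEST_ID.search(row["HTTPRequest"])
        if found and found.group(1) in delete_ids:
            matches.add((row["Caller"], found.group(1)))
    return sorted(matches)


if __name__ == "__main__":
    # Hypothetical sample rows, for illustration only.
    diagnostics = [{"OperationName": "SqlContainersDelete", "activityId_g": "abc-123"}]
    activity = [{"Caller": "user@example.com",
                 "HTTPRequest": '{"clientRequestId": "abc-123", "method": "DELETE"}'}]
    print(callers_for_deletes(diagnostics, activity))
```

The match depends on the client passing the activity ID as clientRequestId. As noted earlier, some Resource Manager clients do not pass it, in which case the timestamp is the fallback.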

Query to get index or TTL updates. You can then compare the output of this query with an earlier update to see the change in index or TTL.

AzureDiagnostics
| where Category =="ControlPlaneRequests"
| where  OperationName == "SqlContainersUpdate"
| project resourceDetails_s

Output:

{id:skewed,indexingPolicy:{automatic:true,indexingMode:consistent,includedPaths:[{path:/*,indexes:[]}],excludedPaths:[{path:/_etag/?}],compositeIndexes:[],spatialIndexes:[]},partitionKey:{paths:[/pk],kind:Hash},defaultTtl:1000000,uniqueKeyPolicy:{uniqueKeys:[]},conflictResolutionPolicy:{mode:LastWriterWins,conflictResolutionPath:/_ts,conflictResolutionProcedure:}
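Comparing two updates can be sketched in Python as a diff over the settings of interest. This assumes each resourceDetails payload has already been turned into a dictionary. The output as logged above has its quotes stripped, so it cannot be passed directly to a JSON parser. The function name `changed_settings` and the sample values are hypothetical; the key names come from the output shown above.

```python
def changed_settings(before, after, keys=("indexingPolicy", "defaultTtl")):
    """Return {key: (old, new)} for each watched key whose value differs."""
    return {
        key: (before.get(key), after.get(key))
        for key in keys
        if before.get(key) != after.get(key)
    }


if __name__ == "__main__":
    # Hypothetical earlier and later payloads for the same container.
    earlier = {"id": "skewed", "defaultTtl": 1000000,
               "indexingPolicy": {"indexingMode": "consistent"}}
    later = {"id": "skewed", "defaultTtl": 3600,
             "indexingPolicy": {"indexingMode": "consistent"}}
    print(changed_settings(earlier, later))
```

Only the watched keys are compared, so unrelated differences in the payload do not show up in the result.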

Next steps