Monitor usage using cluster, pool, and workspace tags

To monitor cost and accurately attribute Azure Databricks usage to your organization's business units and teams (for chargebacks, for example), you can tag workspaces (resource groups), clusters, and pools. These tags propagate to detailed cost analysis reports that you can access in the Azure portal.

For example, here is a cost analysis invoice details report in the Azure portal that breaks down cost by the clusterid tag over a one-month period:

Figure: Cost analysis by cluster ID

Tagged objects and resources

You can add custom tags to the following objects managed by Azure Databricks:

Object | Tagging interface (UI) | Tagging interface (API)
Workspace | Azure portal | Azure Resources API
Pool | Pool UI in the Azure Databricks workspace | Instance Pool API
Cluster | Cluster UI in the Azure Databricks workspace | Clusters API
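
As a rough illustration of tagging through the Clusters API, the sketch below builds a cluster specification that carries custom tags in the `custom_tags` field. The helper name `build_cluster_spec`, the tag names, and the `spark_version` and `node_type_id` values are placeholders chosen for this example, not values taken from this article; substitute ones that are valid in your workspace. The sketch only builds the request body and does not call the API.

```python
import json


def build_cluster_spec(cluster_name, custom_tags):
    """Return a cluster spec dict with custom tags attached (hypothetical helper)."""
    return {
        "cluster_name": cluster_name,
        # Placeholder values (assumptions): replace with versions and node
        # types that exist in your workspace.
        "spark_version": "13.3.x-scala2.12",
        "node_type_id": "Standard_DS3_v2",
        "num_workers": 2,
        # Custom tags are plain string key/value pairs.
        "custom_tags": dict(custom_tags),
    }


spec = build_cluster_spec("etl-nightly", {"CostCenter": "1234", "Team": "data-eng"})
print(json.dumps(spec, indent=2))
```

A body like this would typically be sent to the cluster creation endpoint; pools accept custom tags in a similar way through the Instance Pool API.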

Azure Databricks adds the following default tags to all pools and clusters:

Pool tag key name | Value
Vendor | Constant "Databricks"
DatabricksInstancePoolCreatorId | Azure Databricks internal identifier of the user who created the pool
DatabricksInstancePoolId | Azure Databricks internal identifier of the pool

Cluster tag key name | Value
Vendor | Constant "Databricks"
ClusterId | Azure Databricks internal identifier of the cluster
ClusterName | Name of the cluster
Creator | Username (email address) of the user who created the cluster

On job clusters, Azure Databricks also applies the following default tags:

Cluster tag key name | Value
RunName | Job name
JobId | Job ID

Tag propagation

Azure Databricks aggregates workspace, pool, and cluster tags and propagates them to Azure VMs for cost analysis reporting. However, pool tags and cluster tags are propagated differently.


Workspace and pool tags are aggregated and assigned as resource tags of the Azure VMs that host the pools.

Workspace and cluster tags are aggregated and assigned as resource tags of the Azure VMs that host the clusters.

When clusters are created from pools, only workspace tags and pool tags are propagated to the VMs. Cluster tags are not propagated, in order to preserve pool cluster startup performance.
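
The propagation rules above can be sketched as a small function. This is an illustrative model only, not Azure Databricks code: the function name `propagated_tags` and the example tag names are hypothetical, and the sketch does not model default tags or name collisions between tag sources, which are covered in the next section.

```python
def propagated_tags(workspace_tags, pool_tags=None, cluster_tags=None, from_pool=False):
    """Return the custom tags that would reach the VMs under the rules above.

    - Cluster created from a pool: workspace tags + pool tags only.
    - Standalone cluster: workspace tags + cluster tags.
    Assumption: tag names do not collide across sources in this sketch.
    """
    tags = dict(workspace_tags)
    if from_pool:
        # Cluster tags are deliberately left out for pool-backed clusters.
        tags.update(pool_tags or {})
    else:
        tags.update(cluster_tags or {})
    return tags


workspace = {"env": "prod"}
pool = {"pool_owner": "platform"}
cluster = {"team": "ml"}

print(propagated_tags(workspace, pool, cluster, from_pool=True))
print(propagated_tags(workspace, cluster_tags=cluster))
```

The practical consequence is that chargeback by a custom cluster tag does not work for pool-backed clusters; tag the pool instead.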

Tag conflict resolution

If a custom cluster tag, pool tag, or workspace tag has the same name as an Azure Databricks default cluster or pool tag, the custom tag is prefixed with x_ when it is propagated.

For example, if a workspace is tagged with vendor = Azure Databricks, that tag conflicts with the default cluster tag vendor = Databricks. The tags are therefore propagated as x_vendor = Azure Databricks and vendor = Databricks.
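
The renaming rule can be modeled in a few lines. The sketch below is an assumption-laden illustration rather than the service's implementation: `resolve_conflicts` is a hypothetical helper, and it compares tag names by exact string match, since the text does not say how letter case is handled.

```python
def resolve_conflicts(custom_tags, default_tags):
    """Merge custom tags with default tags, prefixing clashing custom names with x_.

    Assumption: names are compared by exact match; case handling is not
    specified in the text above.
    """
    resolved = dict(default_tags)
    for key, value in custom_tags.items():
        if key in default_tags:
            # The default tag keeps its name; the custom tag is renamed.
            resolved["x_" + key] = value
        else:
            resolved[key] = value
    return resolved


defaults = {"vendor": "Databricks"}
custom = {"vendor": "Azure Databricks", "team": "ml"}
print(resolve_conflicts(custom, defaults))
```

With the example from the text, the result contains both vendor = Databricks and x_vendor = Azure Databricks, so cost reports filtered on the custom value need to use the x_-prefixed name.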

Limitations

  • It can take up to one hour for custom workspace tags to propagate to Azure Databricks after any change.
  • No more than 50 tags can be assigned to an Azure resource. If the overall count of aggregated tags exceeds this limit, x_-prefixed tags are evaluated in alphabetical order and those that exceed the limit are ignored. If all x_-prefixed tags are ignored and the count is still over the limit, the remaining tags are evaluated in alphabetical order and those that exceed the limit are ignored.
  • Tag keys and values can contain only characters from the ISO 8859-1 (latin1) set. Tags containing other characters are ignored.
  • If you change tag key names or values, these changes apply only after cluster restart or pool expansion.
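
To check a tag set against the character-set and 50-tag limits before applying it, something like the following sketch may help. It is one reading of the rules above, not the service's actual algorithm: `is_latin1` and `apply_limits` are hypothetical helpers, Python's `latin-1` codec is used as a stand-in for the ISO 8859-1 check, and "evaluated in alphabetical order" is interpreted as keeping tags in sorted order until the limit is reached.

```python
AZURE_TAG_LIMIT = 50  # maximum number of tags per Azure resource, per the text above


def is_latin1(text):
    """Return True if every character fits in Python's latin-1 codec."""
    try:
        text.encode("latin-1")
        return True
    except UnicodeEncodeError:
        return False


def apply_limits(tags, limit=AZURE_TAG_LIMIT):
    """Drop non-latin1 tags, then trim to the limit (x_-prefixed tags go first)."""
    valid = {k: v for k, v in tags.items() if is_latin1(k) and is_latin1(v)}
    if len(valid) <= limit:
        return valid
    regular = sorted(k for k in valid if not k.startswith("x_"))
    prefixed = sorted(k for k in valid if k.startswith("x_"))
    # Regular tags are kept first; x_-prefixed tags fill whatever room is left.
    kept_regular = regular[:limit]
    kept_prefixed = prefixed[: limit - len(kept_regular)]
    return {k: valid[k] for k in kept_regular + kept_prefixed}


tags = {f"tag{i:02d}": "v" for i in range(49)}
tags.update({"x_a": "v", "x_b": "v", "\u540d\u79f0": "v"})  # last key is not latin1
result = apply_limits(tags)
print(len(result), "x_a" in result, "x_b" in result)
```

Running a check like this ahead of time makes it easier to see which tags are likely to be dropped, rather than discovering the gap later in a cost report.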