Deploy Azure Databricks in your Azure virtual network (VNet injection)

The default deployment of Azure Databricks is a fully managed service on Azure: all data plane resources, including a virtual network (VNet) that all clusters are associated with, are deployed to a locked resource group. If you require network customization, however, you can deploy Azure Databricks data plane resources in your own virtual network (sometimes called VNet injection).

Deploying Azure Databricks data plane resources to your own VNet also lets you take advantage of flexible CIDR ranges (anywhere between /16 and /24 for the VNet and up to /26 for the subnets).

Important

You cannot replace the virtual network for an existing workspace. If your current workspace cannot accommodate the required number of active cluster nodes, we recommend that you create another workspace in a larger virtual network. Follow the detailed migration steps to copy resources (notebooks, cluster configurations, jobs) from the old workspace to the new one.

Virtual network requirements

The VNet that you deploy your Azure Databricks workspace to must meet the following requirements:

  • Region: The VNet must reside in the same region as the Azure Databricks workspace.

  • Subscription: The VNet must be in the same subscription as the Azure Databricks workspace.

  • Address space: A CIDR block between /16 and /24 for the VNet and a CIDR block up to /26 for the two subnets: a container subnet and a host subnet. For guidance about maximum cluster nodes based on the size of your VNet and its subnets, see Address space and maximum cluster nodes.

  • Subnets: The VNet must include two subnets dedicated to your Azure Databricks workspace: a container subnet (sometimes called the private subnet) and a host subnet (sometimes called the public subnet). However, for a workspace that uses secure cluster connectivity, both the container subnet and the host subnet are private. Sharing subnets across workspaces, or deploying other Azure resources on the subnets that are used by your Azure Databricks workspace, is unsupported.

    Important

    There is a one-to-one relationship between these subnets and an Azure Databricks workspace. You cannot share a single subnet across multiple workspaces.
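
The address-space and subnet requirements above can be checked before you deploy. The following is a minimal sketch that uses only Python's standard `ipaddress` module. The function name `check_vnet_plan` and the sample ranges are illustrative assumptions, not part of any Azure or Azure Databricks API, and the sketch reads "up to /26" as "the subnet prefix length must not exceed 26".

```python
import ipaddress


def check_vnet_plan(vnet_cidr, container_cidr, host_cidr):
    """Return a list of problems found in a planned VNet layout (empty if none)."""
    vnet = ipaddress.ip_network(vnet_cidr)
    container = ipaddress.ip_network(container_cidr)
    host = ipaddress.ip_network(host_cidr)
    problems = []

    # The VNet CIDR block must be between /16 and /24.
    if not 16 <= vnet.prefixlen <= 24:
        problems.append(f"VNet prefix /{vnet.prefixlen} is outside /16-/24")

    for name, subnet in (("container", container), ("host", host)):
        # Each subnet may be no smaller than /26.
        if subnet.prefixlen > 26:
            problems.append(f"{name} subnet /{subnet.prefixlen} is smaller than /26")
        # Each subnet must sit inside the VNet address space.
        if not subnet.subnet_of(vnet):
            problems.append(f"{name} subnet {subnet} is not inside {vnet}")

    # The container and host subnets must be separate, non-overlapping ranges.
    if container.overlaps(host):
        problems.append("container and host subnets overlap")
    return problems


# Hypothetical sample ranges: a valid plan, then one with several problems.
print(check_vnet_plan("10.0.0.0/16", "10.0.0.0/18", "10.0.64.0/18"))
print(check_vnet_plan("10.0.0.0/25", "10.0.0.0/27", "10.0.0.0/27"))
```

The region and subscription requirements are not covered here because they cannot be derived from address ranges alone.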

For more information about templates to configure your VNet and deploy your workspace, see Azure-Databricks-supplied Azure Resource Manager templates.

Address space and maximum cluster nodes

A workspace with a smaller virtual network can run out of IP addresses (network space) more quickly than a workspace with a larger virtual network. Use a CIDR block between /16 and /24 for the VNet and a CIDR block up to /26 for the two subnets (the container subnet and the host subnet).

The CIDR range for your VNet address space affects the maximum number of cluster nodes that your workspace can use:

  • An Azure Databricks workspace requires two subnets in the VNet: a container subnet (also known as the private subnet) and a host subnet (also known as the public subnet). If the workspace uses secure cluster connectivity, both the container and host subnets are private.
  • Azure reserves five IP addresses in each subnet.
  • Within each subnet, Azure Databricks requires one IP address per cluster node. In total, each cluster node uses two IP addresses: one for the host in the host subnet and one for the container in the container subnet.
  • You may not want to use all the address space of your VNet. For example, you might want to create multiple workspaces in one VNet. Because you cannot share subnets across workspaces, you may want subnets that do not use the total VNet address space.
  • You must allocate address space for two new subnets that are within the VNet's address space and don't overlap the address space of current or future subnets in that VNet.

The following table shows the maximum subnet size based on network size. This table assumes that no additional subnets exist that take up address space. Use smaller subnets if you have pre-existing subnets or if you want to reserve address space for other subnets:

VNet address space (CIDR) | Maximum Azure Databricks subnet size (CIDR), assuming no other subnets
/16                       | /17
/17                       | /18
/18                       | /19
/20                       | /21
/21                       | /22
/22                       | /23
/23                       | /24
/24                       | /25
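
Each row of the table follows from splitting the VNet into two equal halves, one per subnet. As a rough illustration (a sketch using Python's standard `ipaddress` module; the function name `largest_subnet_pair`, the sample VNet range, and the choice of which half is the container subnet are all assumptions made for the example):

```python
import ipaddress


def largest_subnet_pair(vnet_cidr):
    """Split a VNet into two equal halves, the largest possible subnet pair."""
    vnet = ipaddress.ip_network(vnet_cidr)
    # prefixlen_diff=1 yields exactly two subnets, each one prefix bit longer.
    container, host = vnet.subnets(prefixlen_diff=1)
    return container, host


container, host = largest_subnet_pair("10.10.0.0/20")
print(container, host)  # 10.10.0.0/21 10.10.8.0/21
```

If the VNet already contains other subnets, or you want to leave room for them, choose a longer prefix than this split produces.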

To find the maximum number of cluster nodes based on the subnet size, use the following table. The IP addresses per subnet column includes the five Azure-reserved IP addresses. The rightmost column indicates the number of cluster nodes that can run simultaneously in a workspace that is provisioned with subnets of that size.

Subnet size (CIDR) | IP addresses per subnet | Maximum Azure Databricks cluster nodes
/17                | 32768                   | 32763
/18                | 16384                   | 16379
/19                | 8192                    | 8187
/20                | 4096                    | 4091
/21                | 2048                    | 2043
/22                | 1024                    | 1019
/23                | 512                     | 507
/24                | 256                     | 251
/25                | 128                     | 123
/26                | 64                      | 59
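
The figures in the table come from simple arithmetic: a subnet with prefix length n holds 2^(32 - n) addresses, Azure reserves five of them, and each cluster node needs one address in each of the two subnets. A minimal sketch of that calculation (the names `AZURE_RESERVED_IPS` and `max_cluster_nodes` are illustrative, not part of any API):

```python
AZURE_RESERVED_IPS = 5  # Azure reserves five IP addresses in each subnet


def max_cluster_nodes(subnet_prefix):
    """Maximum simultaneous cluster nodes for subnets of the given prefix length."""
    total_ips = 2 ** (32 - subnet_prefix)
    # Each node uses one IP in the host subnet and one in the container
    # subnet, so the usable count of a single subnet is the node limit.
    return total_ips - AZURE_RESERVED_IPS


for prefix in (17, 22, 26):
    print(f"/{prefix}: {max_cluster_nodes(prefix)} nodes")
# /17: 32763 nodes
# /22: 1019 nodes
# /26: 59 nodes
```

The calculation assumes that the host and container subnets are the same size; if they differ, the smaller subnet sets the limit.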

Create an Azure Databricks workspace using Azure portal

This section describes how to create an Azure Databricks workspace in the Azure portal and deploy it in your own existing VNet. Azure Databricks updates the VNet with two new subnets if they do not exist yet, using CIDR ranges that you specify. The service also updates the subnets with a new network security group, configures inbound and outbound rules, and finally deploys the workspace to the updated VNet. For more control over the configuration of the VNet, for example to use existing network security groups or create your own security rules, use Azure-Databricks-supplied Azure Resource Manager (ARM) templates instead of the portal UI. See Advanced configuration using Azure Resource Manager templates.

Important

The user who creates the workspace must be assigned the Network contributor role or a custom role that is assigned the Microsoft.Network/virtualNetworks/subnets/join/action action.

You must configure a VNet to which you will deploy the Azure Databricks workspace. You can use an existing VNet or create a new one, but the VNet must be in the same region and the same subscription as the Azure Databricks workspace that you plan to create. The VNet must be sized with a CIDR range between /16 and /24. For more requirements, see Virtual network requirements.

When you configure your workspace, you can either use existing subnets or specify names and IP ranges for new subnets.

  1. In the Azure portal, select + Create a resource > Analytics > Azure Databricks or search for Azure Databricks and click Create or + Add to launch the Azure Databricks Service dialog.

  2. Follow the configuration steps described in the Create an Azure Databricks workspace in your own VNet quickstart.

  3. In the Networking tab, select the VNet that you want to use in the Virtual network field.

    Important

    If you do not see the network name in the picker, confirm that the Azure region that you specified for the workspace matches the Azure region of the desired VNet.

  4. Name your subnets and provide CIDR ranges in a block up to size /26. For guidance about maximum cluster nodes based on the size of your VNet and its subnets, see Address space and maximum cluster nodes.

    • To specify existing subnets, specify the exact names of the existing subnets. When using existing subnets, also set the IP ranges in the workspace creation form to exactly match the IP ranges of the existing subnets.
    • To create new subnets, specify subnet names that do not already exist in that VNet. The subnets are created with the specified IP ranges. You must specify IP ranges that are within the IP range of your VNet and not already allocated to existing subnets.

    The subnets get associated network security group rules that include the rule to allow cluster-internal communication. Azure Databricks will have delegated permissions to update both subnets via the Microsoft.Databricks/workspaces resource provider. These permissions apply only to network security group rules that are required by Azure Databricks, not to other network security group rules that you add or to the default network security group rules included with all network security groups.

  5. Click Create to deploy the Azure Databricks workspace to the VNet.

    Note

    When a workspace deployment fails, the workspace is still created but has a failed state. Delete the failed workspace and create a new workspace that resolves the deployment errors. When you delete the failed workspace, the managed resource group and any successfully deployed resources are also deleted.
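
The subnet rules in step 4 (an existing subnet must be entered with its exact range, and a new subnet must fall inside the VNet without touching ranges that are already allocated) can be expressed as a small pre-check. This is only a sketch of that logic using Python's standard `ipaddress` module; the function `check_subnet_entry`, the subnet names, and the ranges are hypothetical, and the portal performs its own validation.

```python
import ipaddress


def check_subnet_entry(vnet_cidr, existing, name, cidr):
    """Check one subnet entry from the workspace creation form.

    existing maps the names of subnets already in the VNet to their CIDR ranges.
    Returns "ok" or a short description of the problem.
    """
    vnet = ipaddress.ip_network(vnet_cidr)
    requested = ipaddress.ip_network(cidr)

    if name in existing:
        # Existing subnet: the range in the form must match it exactly.
        if ipaddress.ip_network(existing[name]) != requested:
            return "range does not match the existing subnet"
        return "ok"

    # New subnet: must be inside the VNet and must not overlap existing subnets.
    if not requested.subnet_of(vnet):
        return "range is outside the VNet address space"
    for other_name, other_cidr in existing.items():
        if requested.overlaps(ipaddress.ip_network(other_cidr)):
            return f"range overlaps existing subnet {other_name}"
    return "ok"


existing_subnets = {"host-subnet": "10.0.0.0/24"}  # hypothetical existing subnet
print(check_subnet_entry("10.0.0.0/16", existing_subnets, "container-subnet", "10.0.1.0/24"))
print(check_subnet_entry("10.0.0.0/16", existing_subnets, "container-subnet", "10.0.0.128/25"))
print(check_subnet_entry("10.0.0.0/16", existing_subnets, "host-subnet", "10.0.0.0/25"))
```

The first call passes, the second is rejected because the new range overlaps `host-subnet`, and the third is rejected because the range entered for the existing subnet does not match it.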

Advanced configuration using Azure Resource Manager templates

If you want more control over the configuration of the VNet, you can use the following Azure Resource Manager (ARM) templates instead of the portal-UI-based automatic VNet configuration and workspace deployment. For example, you can use existing subnets, use an existing network security group, or add your own security rules.

If you are using a custom Azure Resource Manager template or the Workspace Template for Azure Databricks VNet Injection to deploy a workspace to an existing VNet, you must create host and container subnets, attach a network security group to each subnet, and delegate the subnets to the Microsoft.Databricks/workspaces resource provider before deploying the workspace. You must have a separate pair of subnets for each workspace that you deploy.
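
To make the three prerequisites concrete (an address range, an attached network security group, and a delegation), the sketch below builds a subnet definition as a Python dictionary and prints it as JSON. The property names follow the general shape of a subnet entry in an ARM virtual network resource as commonly written; treat them, the helper name `delegated_subnet`, the delegation name, and the placeholder NSG ID as assumptions, and compare against the Azure-Databricks-supplied templates before relying on them.

```python
import json


def delegated_subnet(name, address_prefix, nsg_id):
    """Build a subnet definition delegated to Microsoft.Databricks/workspaces."""
    return {
        "name": name,
        "properties": {
            "addressPrefix": address_prefix,
            # A network security group must be attached to each subnet.
            "networkSecurityGroup": {"id": nsg_id},
            # Delegation to the Azure Databricks resource provider.
            "delegations": [
                {
                    "name": "databricks-delegation",  # hypothetical delegation name
                    "properties": {"serviceName": "Microsoft.Databricks/workspaces"},
                }
            ],
        },
    }


# Hypothetical names and ranges; "<nsg-resource-id>" is a placeholder.
subnets = [
    delegated_subnet("host-subnet", "10.0.0.0/24", "<nsg-resource-id>"),
    delegated_subnet("container-subnet", "10.0.1.0/24", "<nsg-resource-id>"),
]
print(json.dumps(subnets, indent=2))
```

Each workspace needs its own pair of such subnets, so a second workspace in the same VNet would call the helper with two further, non-overlapping ranges.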

All-in-one template

To create a VNet and an Azure Databricks workspace using one template, use the All-in-one Template for Azure Databricks VNet Injected Workspaces.

Virtual network template

To create a VNet with the proper subnets using a template, use the VNet Template for Databricks VNet Injection.

Azure Databricks workspace template

To deploy an Azure Databricks workspace to an existing VNet with a template, use the Workspace Template for Azure Databricks VNet Injection.

The workspace template allows you to specify an existing VNet and use existing subnets:

  • You must have a separate pair of host/container subnets for each workspace that you deploy. Sharing subnets across workspaces, or deploying other Azure resources on the subnets that are used by your Azure Databricks workspace, is unsupported.
  • Your VNet's host and container subnets must have network security groups attached and must be delegated to the Microsoft.Databricks/workspaces service before you use this Azure Resource Manager template to deploy a workspace.
  • To create a VNet with properly delegated subnets, use the VNet Template for Databricks VNet Injection.
  • To use an existing VNet when you have not yet delegated the host and container subnets, see Add or remove a subnet delegation or Upgrade your VNet Injection preview workspace to GA.

Network security group rules

The following tables display the current network security group rules used by Azure Databricks. If Azure Databricks needs to add a rule or change the scope of an existing rule on this list, you will receive advance notice. This article and the tables will be updated whenever such a modification occurs.

How Azure Databricks manages network security group rules

The NSG rules listed in the following sections are those that Azure Databricks auto-provisions and manages in your NSG, by virtue of the delegation of your VNet's host and container subnets to the Microsoft.Databricks/workspaces service. You do not have permission to update or delete these NSG rules; any attempt to do so is blocked by the subnet delegation. Azure Databricks must own these rules to ensure that Microsoft can reliably operate and support the Azure Databricks service in your VNet.

Some of these NSG rules have VirtualNetwork assigned as the source and destination. This simplifies the design in the absence of a subnet-level service tag in Azure. All clusters are protected internally by a second layer of network policy, such that cluster A cannot connect to cluster B in the same workspace. This also applies across multiple workspaces if your workspaces are deployed into different pairs of subnets in the same customer-managed VNet.

Important

If the workspace VNet is peered to another customer-managed network, or if non-Azure Databricks resources are provisioned in other subnets, Databricks recommends that you add Deny inbound rules to the NSGs attached to the other networks and subnets to block source traffic from Azure Databricks clusters. You do not need to add such rules for resources that you want your Azure Databricks clusters to connect to.

Network security group rules for workspaces created after January 13, 2020

The following table applies only to Azure Databricks workspaces created after January 13, 2020. If your workspace was created before the release of secure cluster connectivity (SCC) on January 13, 2020, see the table under Network security group rules for workspaces created before January 13, 2020.

Important

The following table includes two inbound security group rules that are included only if secure cluster connectivity (SCC) is disabled.

Direction | Protocol | Source | Source Port | Destination | Dest Port | Used
Inbound | Any | VirtualNetwork | Any | VirtualNetwork | Any | Default
Inbound | TCP | AzureDatabricks (service tag), only if SCC is disabled | Any | VirtualNetwork | 22 | Public IP
Inbound | TCP | AzureDatabricks (service tag), only if SCC is disabled | Any | VirtualNetwork | 5557 | Public IP
Outbound | TCP | VirtualNetwork | Any | AzureDatabricks (service tag) | 443 | Default
Outbound | TCP | VirtualNetwork | Any | SQL | 3306 | Default
Outbound | TCP | VirtualNetwork | Any | Storage | 443 | Default
Outbound | Any | VirtualNetwork | Any | VirtualNetwork | Any | Default
Outbound | TCP | VirtualNetwork | Any | EventHub | 9093 | Default
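
If you audit an NSG against this table, it can help to hold the rules as data and derive the expected set for your workspace. The sketch below transcribes the table above into Python and filters out the two inbound rules that exist only when SCC is disabled. The `NsgRule` type, the `RULES` list, and `expected_rules` are illustrative names introduced here; they are not an Azure or Azure Databricks API, and the values are copied from the table rather than read from a live NSG.

```python
from typing import NamedTuple


class NsgRule(NamedTuple):
    direction: str
    protocol: str
    source: str
    source_port: str
    destination: str
    dest_port: str
    used: str
    only_if_scc_disabled: bool = False


# Transcribed from the table for workspaces created after January 13, 2020.
RULES = [
    NsgRule("Inbound", "Any", "VirtualNetwork", "Any", "VirtualNetwork", "Any", "Default"),
    NsgRule("Inbound", "TCP", "AzureDatabricks (service tag)", "Any", "VirtualNetwork", "22", "Public IP", True),
    NsgRule("Inbound", "TCP", "AzureDatabricks (service tag)", "Any", "VirtualNetwork", "5557", "Public IP", True),
    NsgRule("Outbound", "TCP", "VirtualNetwork", "Any", "AzureDatabricks (service tag)", "443", "Default"),
    NsgRule("Outbound", "TCP", "VirtualNetwork", "Any", "SQL", "3306", "Default"),
    NsgRule("Outbound", "TCP", "VirtualNetwork", "Any", "Storage", "443", "Default"),
    NsgRule("Outbound", "Any", "VirtualNetwork", "Any", "VirtualNetwork", "Any", "Default"),
    NsgRule("Outbound", "TCP", "VirtualNetwork", "Any", "EventHub", "9093", "Default"),
]


def expected_rules(scc_enabled):
    """Rules expected in the NSG, given whether secure cluster connectivity is on."""
    return [r for r in RULES if not (scc_enabled and r.only_if_scc_disabled)]


for rule in expected_rules(scc_enabled=True):
    print(rule.direction, rule.protocol, rule.source, "->", rule.destination, rule.dest_port)
```

With SCC enabled the function returns six rules; with SCC disabled it returns all eight, including the inbound rules on ports 22 and 5557.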

Network security group rules for workspaces created before January 13, 2020

The following table applies only to Azure Databricks workspaces created before January 13, 2020. If your workspace was created on or after January 13, 2020, see the previous table.

Direction | Protocol | Source | Source Port | Destination | Dest Port | Used
Inbound | Any | VirtualNetwork | Any | VirtualNetwork | Any | Default
Inbound | TCP | ControlPlane IP | Any | VirtualNetwork | 22 | Public IP
Inbound | TCP | ControlPlane IP | Any | VirtualNetwork | 5557 | Public IP
Outbound | TCP | VirtualNetwork | Any | Webapp IP | 443 | Default
Outbound | TCP | VirtualNetwork | Any | SQL | 3306 | Default
Outbound | TCP | VirtualNetwork | Any | Storage | 443 | Default
Outbound | Any | VirtualNetwork | Any | VirtualNetwork | Any | Default
Outbound | TCP | VirtualNetwork | Any | EventHub | 9093 | Default

Important

Azure Databricks is a Microsoft Azure first-party service that is deployed on the Global Azure Public Cloud infrastructure. All communications between components of the service, including between the public IPs in the control plane and the customer data plane, remain within the Microsoft Azure network backbone. See also Microsoft global network.

Troubleshooting

Workspace creation errors

Subnet requires any of the following delegation(s) [Microsoft.Databricks/workspaces] to reference service association link

Possible cause: you are creating a workspace in a VNet whose host and container subnets have not been delegated to the Microsoft.Databricks/workspaces service. Each subnet must have a network security group attached and must be properly delegated. See Virtual network requirements for more information.

The subnet is already in use by workspace

Possible cause: you are creating a workspace in a VNet with host and container subnets that are already being used by an existing Azure Databricks workspace. You cannot share a single subnet across multiple workspaces. You must have a new pair of host and container subnets for each workspace that you deploy.

Cluster creation errors

Instances Unreachable: Resources were not reachable via SSH.

Possible cause: traffic from the control plane to workers is blocked. If you are deploying to an existing VNet connected to your on-premises network, review your setup using the information supplied in Connect your Azure Databricks Workspace to your on-premises network.

Unexpected Launch Failure: An unexpected error was encountered while setting up the cluster. Please retry and contact Azure Databricks if the problem persists. Internal error message: Timeout while placing node.

Possible cause: traffic from workers to Azure Storage endpoints is blocked. If you are using custom DNS servers, also check the status of the DNS servers in your VNet.
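
One way to tell a DNS problem from blocked traffic is to test the two steps separately: first resolve the endpoint name, then attempt a TCP connection. The following is a generic sketch using Python's standard `socket` module; the function `check_endpoint` is a hypothetical helper, not an Azure Databricks diagnostic tool, and it only tells you something about the network it is run from, so run it from a machine that uses the same DNS servers and routing as your workers.

```python
import socket


def check_endpoint(hostname, port=443, timeout=5.0):
    """Resolve hostname, then try a TCP connection; return a short status string."""
    try:
        # DNS step: fails if the DNS server cannot resolve the name.
        address = socket.gethostbyname(hostname)
    except socket.gaierror as err:
        return f"DNS resolution failed: {err}"
    try:
        # Network step: fails if routing or a security rule blocks the traffic.
        with socket.create_connection((address, port), timeout=timeout):
            return f"reachable at {address}:{port}"
    except OSError as err:
        return f"resolved to {address} but connection failed: {err}"
```

For example, calling `check_endpoint` with the blob endpoint host name of your own storage account and port 443 returns a "DNS resolution failed" message when the custom DNS servers are the problem, and a "connection failed" message when the name resolves but the traffic is blocked.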

Cloud Provider Launch Failure: A cloud provider error was encountered while setting up the cluster. See the Azure Databricks guide for more information. Azure error code: AuthorizationFailed/InvalidResourceReference.

Possible cause: the VNet or subnets no longer exist. Make sure that the VNet and subnets exist.

Cluster terminated. Reason: Spark Startup Failure: Spark was not able to start in time. This issue can be caused by a malfunctioning Hive metastore, invalid Spark configurations, or malfunctioning init scripts. Please refer to the Spark driver logs to troubleshoot this issue, and contact Databricks if the problem persists. Internal error message: Spark failed to start: Driver failed to start in time.

Possible cause: the container cannot talk to the hosting instance or the DBFS storage account. To fix this, add a custom route to the subnets for the DBFS storage account with the next hop set to Internet.