Deploy Azure Databricks in your Azure virtual network (VNet injection)

Important

If you deployed an Azure Databricks workspace in your own Azure virtual network during the VNet injection preview and have not yet upgraded to GA, you will lose access to your workspace on June 1st. On or after June 1st, you must follow the upgrade steps and then open a support ticket to regain access to your workspace. See Upgrade your VNet Injection Preview Workspace to GA.

The default deployment of Azure Databricks is a fully managed service on Azure: all data plane resources, including a virtual network (VNet) that all clusters are associated with, are deployed to a locked resource group. If you require network customization, however, you can deploy Azure Databricks data plane resources in your own virtual network (sometimes called VNet injection).

Deploying Azure Databricks data plane resources to your own virtual network also lets you take advantage of flexible CIDR ranges (anywhere between /16 and /24 for the virtual network and up to /26 for the subnets).

Important

You cannot replace the virtual network for an existing workspace. If your current workspace cannot accommodate the required number of active cluster nodes, we recommend that you create another workspace in a larger virtual network. Follow these detailed migration steps to copy resources (notebooks, cluster configurations, jobs) from the old workspace to the new one.

Virtual network requirements

The virtual network that you deploy your Azure Databricks workspace to must meet the following requirements:

  • Location: The virtual network must reside in the same location as the Azure Databricks workspace.

  • Subscription: The virtual network must be in the same subscription as the Azure Databricks workspace.

  • Subnets: The virtual network must include two subnets dedicated to Azure Databricks: a private subnet and a public subnet. The public subnet allows communication with the Azure Databricks control plane. The private subnet allows only cluster-internal communication. Do not deploy other Azure resources on the subnets used by your Azure Databricks workspace. Sharing a subnet with other resources, such as virtual machines, prevents managed updates to the intent policy for the subnet.

    If you create the Azure Databricks workspace in the Azure portal, Azure Databricks creates these subnets for you and delegates the public and private subnets to the Microsoft.Databricks/workspaces service during workspace deployment, which allows Azure Databricks to create network security group rules. Azure Databricks will always give advance notice if it needs to add or update the scope of an Azure Databricks-managed NSG rule. Azure Databricks also creates the subnets for you if you use the All-in-one Azure Resource Manager template or the VNet-only Azure Resource Manager template. However, if you use a custom Azure Resource Manager template or the Workspace Azure Resource Manager template, it is up to you to ensure that the subnets have network security groups attached and are properly delegated. For delegation instructions, see Upgrade your VNet Injection preview workspace to GA or Add or remove a subnet delegation.

    Important

    There is a one-to-one relationship between these subnets and an Azure Databricks workspace. Multiple workspaces cannot share a single subnet. You must have a new pair of public and private subnets for each workspace you deploy.

  • Address space: A CIDR block between /16 and /24 for the virtual network and a CIDR block up to /26 for the private and public subnets.
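The address-space requirement can be checked before deployment. The following is a minimal sketch using Python's standard `ipaddress` module, not an official validation tool. It reads "up to /26" as "/26 is the smallest subnet allowed"; the function name `check_address_space` and the example CIDR blocks are hypothetical.

```python
import ipaddress


def check_address_space(vnet_cidr, public_cidr, private_cidr):
    """Return a list of problems; an empty list means these checks passed."""
    problems = []
    vnet = ipaddress.ip_network(vnet_cidr)
    public = ipaddress.ip_network(public_cidr)
    private = ipaddress.ip_network(private_cidr)

    # The virtual network must be between /16 and /24.
    if not 16 <= vnet.prefixlen <= 24:
        problems.append(f"virtual network {vnet} is not between /16 and /24")

    for name, subnet in (("public", public), ("private", private)):
        # Assumption: "up to /26" means a subnet may be no smaller than /26.
        if subnet.prefixlen > 26:
            problems.append(f"{name} subnet {subnet} is smaller than /26")
        if not subnet.subnet_of(vnet):
            problems.append(f"{name} subnet {subnet} is outside {vnet}")

    # The two subnets must be distinct address ranges.
    if public.overlaps(private):
        problems.append("public and private subnets overlap")
    return problems


# Hypothetical example values.
print(check_address_space("10.0.0.0/24", "10.0.0.0/26", "10.0.0.64/26"))
print(check_address_space("10.0.0.0/24", "10.0.0.0/27", "10.0.0.64/26"))
```

The check covers only the CIDR rules stated above; location, subscription, delegation, and network security group requirements still have to be verified separately.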

You can use the Azure Databricks workspace deployment interface in the Azure portal to automatically configure an existing virtual network with the required subnets, or you can use the Azure Resource Manager templates supplied by Azure Databricks to configure your virtual network and deploy your workspace.

Create the Azure Databricks workspace in the Azure portal

This section describes how to create an Azure Databricks workspace in the Azure portal and deploy it in your own existing virtual network. Azure Databricks updates the virtual network with two new subnets and network security groups using CIDR ranges that you provide, adds inbound and outbound subnet traffic to the allow list, and deploys the workspace to the updated virtual network.

Note

If you want more control over the configuration of the virtual network (for example, you may want to use existing subnets, use existing network security groups, or create your own security rules), you can use the Azure Resource Manager templates supplied by Azure Databricks instead of the portal UI. See Configure the virtual network.

Requirements

You must configure a virtual network to which you will deploy the Azure Databricks workspace:

  • You can use an existing virtual network or create a new one, but the virtual network must be in the same region and the same subscription as the Azure Databricks workspace that you plan to create.

  • A CIDR range between /16 and /24 is required for the virtual network.

    Warning

    A workspace with a smaller virtual network can run out of IP addresses (network space) more quickly than a workspace with a larger virtual network. For example, a workspace with a /24 virtual network and /26 subnets can have a maximum of 64 nodes active at a time, whereas a workspace with a /20 virtual network and /22 subnets can house a maximum of 1024 nodes.

    Your subnets are created automatically when you configure your workspace, and you can provide the CIDR range for the subnets during configuration.
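The node limits quoted in the warning match the number of addresses in each subnet block (64 in a /26, 1024 in a /22). The sketch below reproduces that arithmetic with Python's `ipaddress` module. It also subtracts the five addresses that Azure reserves in every subnet, so the practical ceiling is slightly lower than the raw block size. Treating each node as consuming one address per subnet is an assumption, and `subnet_capacity` is a hypothetical helper name.

```python
import ipaddress

# Azure reserves five IP addresses in every subnet.
AZURE_RESERVED_PER_SUBNET = 5


def subnet_capacity(cidr):
    """Return (total addresses, addresses left after Azure's reservation)."""
    total = ipaddress.ip_network(cidr).num_addresses
    return total, total - AZURE_RESERVED_PER_SUBNET


# Hypothetical example subnets matching the sizes in the warning.
for cidr in ("10.0.0.0/26", "10.0.0.0/22"):
    total, usable = subnet_capacity(cidr)
    print(f"{cidr}: {total} addresses, {usable} assignable")
# 10.0.0.0/26: 64 addresses, 59 assignable
# 10.0.0.0/22: 1024 addresses, 1019 assignable
```

Because the virtual network of a workspace cannot be replaced later, it is worth running this kind of estimate against the expected peak number of active cluster nodes before choosing subnet sizes.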

Configure the virtual network

  1. In the Azure portal, select + Create a resource > Analytics > Azure Databricks, or search for Azure Databricks and click Create or + Add, to launch the Azure Databricks Service dialog.

  2. Follow the configuration steps described in the Create an Azure Databricks workspace in your own Virtual Network quickstart.

  3. Select the virtual network you want to use.

  4. Name your public and private subnets and provide CIDR ranges in a block up to /26:

    • A public subnet is created with associated network security group rules that allow communication with the Azure Databricks control plane.
    • A private subnet is created with associated network security group rules that allow cluster-internal communication.
    • Azure Databricks has delegated permissions to update both subnets through the Microsoft.Databricks/workspaces service. These permissions apply only to network security group rules that are required by Azure Databricks, not to other network security group rules that you add or to the default network security group rules included with all network security groups.
  5. Click Create to deploy the Azure Databricks workspace to the virtual network.

Note

When a workspace deployment fails, the workspace is still created in a failed state. Delete the failed workspace and create a new workspace that resolves the deployment errors. When you delete the failed workspace, the managed resource group and any successfully deployed resources are also deleted.

Advanced configuration using Azure Resource Manager templates

If you want more control over the configuration of the virtual network (for example, you want to use existing subnets, use an existing network security group, or add your own security rules), you can use the following Azure Resource Manager (ARM) templates instead of the portal-UI-based automatic virtual network configuration and workspace deployment.

Important

If you deployed an Azure Databricks workspace in your own Azure virtual network during the VNet injection preview, you must upgrade your preview workspace to the GA version before March 31, 2020. The primary upgrade task is to delegate your public and private subnets to the Microsoft.Databricks/workspaces service, which allows Azure Databricks to create network security group rules. Azure Databricks always gives advance notice if it needs to add or update the scope of an Azure Databricks-managed NSG rule. See Upgrade your VNet Injection preview workspace to GA.

If you used Azure Resource Manager templates to deploy an Azure Databricks workspace to your own virtual network during the preview, and you want to continue to use Azure Resource Manager templates to create virtual networks and deploy workspaces, use the following upgraded Azure Resource Manager templates. Follow the template link to get the latest version.

If you are using a custom Azure Resource Manager template or the Workspace Template for Databricks VNet Injection to deploy a workspace to an existing virtual network, you must create public and private subnets, attach a network security group to each subnet, and delegate the subnets to the Microsoft.Databricks/workspaces service before deploying the workspace. You must have a separate pair of public and private subnets for each workspace that you deploy.
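The two subnet prerequisites (an attached network security group and a delegation to Microsoft.Databricks/workspaces) can be checked on a subnet definition before you deploy. The sketch below assumes the subnet is available as a Python dictionary shaped like an ARM subnet resource, with `properties.networkSecurityGroup.id` and `properties.delegations[].properties.serviceName`. The helper name `subnet_problems`, the subnet names, and the shortened resource ID are hypothetical; the snippet does not call Azure.

```python
REQUIRED_DELEGATION = "Microsoft.Databricks/workspaces"


def subnet_problems(subnet):
    """Return the prerequisites that a subnet definition is missing."""
    props = subnet.get("properties") or {}
    problems = []

    # A network security group must be attached to the subnet.
    nsg = props.get("networkSecurityGroup") or {}
    if not nsg.get("id"):
        problems.append("no network security group attached")

    # The subnet must be delegated to the Databricks workspaces service.
    delegated_services = {
        (d.get("properties") or {}).get("serviceName")
        for d in props.get("delegations") or []
    }
    if REQUIRED_DELEGATION not in delegated_services:
        problems.append(f"not delegated to {REQUIRED_DELEGATION}")
    return problems


# Hypothetical subnet definitions for illustration only.
ready_subnet = {
    "name": "databricks-public",
    "properties": {
        "addressPrefix": "10.0.0.0/26",
        "networkSecurityGroup": {"id": "/subscriptions/.../networkSecurityGroups/example-nsg"},
        "delegations": [
            {"name": "databricks-del", "properties": {"serviceName": REQUIRED_DELEGATION}}
        ],
    },
}
bare_subnet = {"name": "databricks-private", "properties": {"addressPrefix": "10.0.0.64/26"}}

print(subnet_problems(ready_subnet))  # []
print(subnet_problems(bare_subnet))
```

Running the check against both the public and the private subnet mirrors the requirement that each of the two subnets satisfies both conditions.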

All-in-one template

To create a virtual network and an Azure Databricks workspace all in one, use the All-in-one Template for Databricks VNet Injected Workspaces.

Virtual network template

To create a virtual network with the proper public and private subnets, use the Virtual Network Template for Databricks VNet Injection.

Azure Databricks workspace template

To deploy an Azure Databricks workspace to an existing virtual network, use the Workspace Template for Databricks VNet Injection.

Important

Your virtual network's public and private subnets must have network security groups attached and must be delegated to the Microsoft.Databricks/workspaces service before you use this Azure Resource Manager template to deploy a workspace. You can use the Virtual Network Template for Databricks VNet Injection to create a virtual network with properly delegated subnets. If you are using an existing virtual network and you have not delegated the public and private subnets, see Add or remove a subnet delegation or Upgrade your VNet Injection preview workspace to GA.

You must have a separate pair of public and private subnets for each workspace that you deploy.

Network security group rules

The following tables display the current network security group rules used by Azure Databricks. If Azure Databricks needs to add a rule or change the scope of an existing rule on this list, you will receive advance notice. This article and the tables will be updated whenever such a modification occurs.

How Azure Databricks manages network security group rules

The NSG rules listed in the following sections are those that Azure Databricks auto-provisions and manages in your NSG, by virtue of the delegation of your virtual network's private and public subnets to the Microsoft.Databricks/workspaces service. You do not have permission to update or delete these NSG rules; any attempt to do so is blocked by the subnet delegation. Azure Databricks must own these rules to ensure that Microsoft can reliably operate and support the Azure Databricks service in your virtual network. If Azure Databricks needs to add a rule or change the scope of an existing rule on this list, you will receive advance notice.

Some of these NSG rules have VirtualNetwork assigned as the source and destination. This simplifies the design in the absence of a subnet-level service tag in Azure. All clusters are protected internally by a second layer of network policy, such that cluster A cannot connect to cluster B in the same workspace. This also applies across multiple workspaces if your workspaces are deployed into different pairs of subnets in the same customer-managed virtual network.

Important

If the workspace virtual network is peered to another customer-managed network, or if non-Azure Databricks resources are provisioned in other subnets, Databricks recommends that you add Deny inbound rules to the NSGs attached to the other networks and subnets to block source traffic from Azure Databricks clusters. You do not need to add such rules for resources that you want your Azure Databricks clusters to connect to.
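As an illustration of the recommendation above, the sketch below builds Deny inbound rule definitions shaped like ARM `securityRules` entries, one per Azure Databricks subnet. The helper name `deny_inbound_rule`, the rule names, the priorities, and the subnet prefixes are all hypothetical; adapt them to the NSGs attached to your other networks and subnets. The snippet only constructs dictionaries and does not call Azure.

```python
import ipaddress


def deny_inbound_rule(name, source_prefix, priority):
    """Build an NSG rule that denies all inbound traffic from source_prefix."""
    # Custom NSG rule priorities must fall between 100 and 4096.
    if not 100 <= priority <= 4096:
        raise ValueError("priority must be between 100 and 4096")
    # Fail early on a malformed CIDR block.
    ipaddress.ip_network(source_prefix)
    return {
        "name": name,
        "properties": {
            "priority": priority,
            "direction": "Inbound",
            "access": "Deny",
            "protocol": "*",
            "sourceAddressPrefix": source_prefix,
            "sourcePortRange": "*",
            "destinationAddressPrefix": "*",
            "destinationPortRange": "*",
        },
    }


# Hypothetical Azure Databricks public and private subnet ranges.
databricks_subnets = {"public": "10.0.0.0/26", "private": "10.0.0.64/26"}
rules = [
    deny_inbound_rule(f"deny-databricks-{label}", prefix, 200 + i)
    for i, (label, prefix) in enumerate(databricks_subnets.items())
]
for rule in rules:
    print(rule["name"], rule["properties"]["sourceAddressPrefix"])
```

Leave such rules off the NSGs of resources that your clusters are meant to reach, as the note above states.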

Network security group rules for workspaces created after January 13, 2020

The following table applies only to Azure Databricks workspaces created after January 13, 2020. If your workspace was created before January 13, 2020, see the next table.

| Direction | Protocol | Source | Source Port | Destination | Dest Port | Used |
|---|---|---|---|---|---|---|
| Inbound | Any | VirtualNetwork | Any | VirtualNetwork | Any | Default |
| Inbound | TCP | AzureDatabricks (service tag) | Any | VirtualNetwork | 22 | Public IP |
| Inbound | TCP | AzureDatabricks (service tag) | Any | VirtualNetwork | 5557 | Public IP |
| Outbound | TCP | VirtualNetwork | Any | AzureDatabricks (service tag) | 443 | Default |
| Outbound | TCP | VirtualNetwork | Any | SQL | 3306 | Default |
| Outbound | TCP | VirtualNetwork | Any | Storage | 443 | Default |
| Outbound | Any | VirtualNetwork | Any | VirtualNetwork | Any | Default |
| Outbound | TCP | VirtualNetwork | Any | EventHub | 9093 | Default |
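When you review your own NSG rules, route tables, or firewall policies against the table above, it can help to hold the managed rules as data. The sketch below encodes the rows of that table and looks up whether a given flow is covered by one of them. It matches on the tag names exactly as written in the table and does not model how Azure evaluates priorities or resolves service tags; `Rule`, `RULES`, and `find_rule` are hypothetical names.

```python
from typing import NamedTuple, Optional


class Rule(NamedTuple):
    direction: str
    protocol: str     # "Any" or a protocol name such as "TCP"
    source: str
    destination: str
    dest_port: str    # "Any" or a port number written as text


# Rows copied from the table for workspaces created after January 13, 2020.
RULES = [
    Rule("Inbound", "Any", "VirtualNetwork", "VirtualNetwork", "Any"),
    Rule("Inbound", "TCP", "AzureDatabricks", "VirtualNetwork", "22"),
    Rule("Inbound", "TCP", "AzureDatabricks", "VirtualNetwork", "5557"),
    Rule("Outbound", "TCP", "VirtualNetwork", "AzureDatabricks", "443"),
    Rule("Outbound", "TCP", "VirtualNetwork", "SQL", "3306"),
    Rule("Outbound", "TCP", "VirtualNetwork", "Storage", "443"),
    Rule("Outbound", "Any", "VirtualNetwork", "VirtualNetwork", "Any"),
    Rule("Outbound", "TCP", "VirtualNetwork", "EventHub", "9093"),
]


def find_rule(direction, protocol, source, destination, port) -> Optional[Rule]:
    """Return the first table row that covers the flow, or None."""
    for rule in RULES:
        if (
            rule.direction == direction
            and rule.protocol in ("Any", protocol)
            and rule.source == source
            and rule.destination == destination
            and rule.dest_port in ("Any", str(port))
        ):
            return rule
    return None


print(find_rule("Outbound", "TCP", "VirtualNetwork", "Storage", 443))
print(find_rule("Outbound", "TCP", "VirtualNetwork", "Storage", 80))  # None
```

A `None` result only means the flow is not among the Azure Databricks-managed rules; default NSG rules and rules you add yourself are outside this list.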

Network security group rules for workspaces created before January 13, 2020

The following table applies only to Azure Databricks workspaces created before January 13, 2020. If your workspace was created on or after January 13, 2020, see the previous table.

| Direction | Protocol | Source | Source Port | Destination | Dest Port | Used |
|---|---|---|---|---|---|---|
| Inbound | Any | VirtualNetwork | Any | VirtualNetwork | Any | Default |
| Inbound | TCP | ControlPlane IP | Any | VirtualNetwork | 22 | Public IP |
| Inbound | TCP | ControlPlane IP | Any | VirtualNetwork | 5557 | Public IP |
| Outbound | TCP | VirtualNetwork | Any | Webapp IP | 443 | Default |
| Outbound | TCP | VirtualNetwork | Any | SQL | 3306 | Default |
| Outbound | TCP | VirtualNetwork | Any | Storage | 443 | Default |
| Outbound | Any | VirtualNetwork | Any | VirtualNetwork | Any | Default |
| Outbound | TCP | VirtualNetwork | Any | EventHub | 9093 | Default |

Important

Azure Databricks is a Microsoft Azure first-party service that is deployed on the Global Azure Public Cloud infrastructure. All communications between components of the service, including between the public IPs in the control plane and the customer data plane, remain within the Microsoft Azure network backbone. See also Microsoft global network.

Troubleshooting

Workspace creation errors

Subnet requires any of the following delegation(s) [Microsoft.Databricks/workspaces] to reference service association link

Possible cause: you are creating a workspace in a VNet whose private and public subnets have not been delegated to the Microsoft.Databricks/workspaces service. Each subnet must have a network security group attached and must be properly delegated. See Virtual network requirements for more information.

The subnet is already in use by workspace

Possible cause: you are creating a workspace in a VNet with private and public subnets that are already being used by an existing Azure Databricks workspace. Multiple workspaces cannot share a single subnet. You must have a new pair of public and private subnets for each workspace you deploy.

Cluster creation errors

Instances Unreachable: Resources were not reachable via SSH.

Possible cause: traffic from the control plane to workers is blocked. If you are deploying to an existing virtual network connected to your on-premises network, review your setup using the information supplied in Connect your Azure Databricks Workspace to your on-premises network.

Unexpected Launch Failure: An unexpected error was encountered while setting up the cluster. Please retry and contact Azure Databricks if the problem persists. Internal error message: Timeout while placing node.

Possible cause: traffic from workers to Azure Storage endpoints is blocked. If you are using custom DNS servers, also check the status of the DNS servers in your virtual network.

Cloud Provider Launch Failure: A cloud provider error was encountered while setting up the cluster. See the Azure Databricks guide for more information. Azure error code: AuthorizationFailed/InvalidResourceReference.

Possible cause: the virtual network or subnets no longer exist. Make sure that the virtual network and subnets exist.

Cluster terminated. Reason: Spark Startup Failure: Spark was not able to start in time. This issue can be caused by a malfunctioning Hive metastore, invalid Spark configurations, or malfunctioning init scripts. Please refer to the Spark driver logs to troubleshoot this issue, and contact Databricks if the problem persists. Internal error message: Spark failed to start: Driver failed to start in time.

Possible cause: the container cannot talk to the hosting instance or the DBFS storage account. To fix this, add a custom route to the subnets for the DBFS storage account with the next hop set to Internet.
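The custom route described in this fix can be expressed as a route-table entry. The sketch below builds a dictionary shaped like an ARM route resource (`properties.addressPrefix` and `properties.nextHopType`). The source does not give the address prefix of the DBFS storage account, so the prefix used here is a documentation-only placeholder, and the helper and route names are hypothetical. The snippet does not call Azure.

```python
import ipaddress


def internet_route(name, address_prefix):
    """Build a route entry whose next hop is the Internet."""
    # Fail early on a malformed CIDR block.
    ipaddress.ip_network(address_prefix)
    return {
        "name": name,
        "properties": {
            "addressPrefix": address_prefix,
            "nextHopType": "Internet",
        },
    }


# Placeholder prefix (reserved for documentation); replace it with the
# address range that applies to your DBFS storage account.
route = internet_route("dbfs-storage-to-internet", "203.0.113.0/24")
print(route)
```

The route table that contains this entry must be associated with both the public and the private subnet of the workspace for the fix to apply to all cluster nodes.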