Troubleshoot FabricInternalServerError or ServiceAllocationFailure when deploying a Cloud service (classic) to Azure

In this article, you'll troubleshoot allocation failures in which the fabric controller cannot allocate instances when you deploy an Azure Cloud service (classic).

When you deploy instances to a Cloud Service or add new web or worker role instances, Azure allocates compute resources.

You may occasionally receive errors during these operations, even before you reach the Azure subscription limit.

Tip

This information may also be useful when you plan the deployment of your services.

Symptom

In the Azure portal, navigate to your Cloud service (classic) and, in the sidebar, select Operation log (classic) to view the logs.

Image shows the Operation log (classic) blade.

When you inspect the logs of your Cloud service (classic), you'll see one of the following exceptions:

Exception | Error message
FabricInternalServerError | Operation failed with error code 'InternalError' and errorMessage 'The server encountered an internal error. Please retry the request.'.
ServiceAllocationFailure | Operation failed with error code 'InternalError' and errorMessage 'The server encountered an internal error. Please retry the request.'.

Cause

FabricInternalServerError and ServiceAllocationFailure are exceptions that can occur when the fabric controller fails to allocate instances in the cluster. The root cause depends on whether the cloud service is pinned to a cluster.

Note

When the first instance is deployed to a cloud service (in either staging or production), that cloud service gets pinned to a cluster.

Over time, the resources in this resource pool may become fully utilized. If a cloud service requests additional resources when the pinned resource pool doesn't have enough available, the request results in an allocation failure.
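The effect of pinning can be illustrated with a toy model. This is only a sketch for building intuition, not how the fabric controller actually works: the cluster names, the capacity numbers, and the allocate function are all hypothetical.

```python
from typing import Dict, Optional


def allocate(clusters: Dict[str, int], requested: int,
             pinned_to: Optional[str] = None) -> Optional[str]:
    """Return the cluster that satisfies the request, or None on allocation failure.

    clusters maps a hypothetical cluster name to its free capacity (in instances).
    """
    if pinned_to is not None:
        # A pinned service may only draw from its own cluster.
        candidates = [pinned_to]
    else:
        # An unpinned service may be placed in any cluster; try the emptiest first.
        candidates = sorted(clusters, key=clusters.get, reverse=True)
    for name in candidates:
        if clusters.get(name, 0) >= requested:
            clusters[name] -= requested
            return name
    return None


clusters = {"cluster-a": 1, "cluster-b": 8}
# The pinned service cannot use the free capacity in cluster-b.
print(allocate(clusters, 4, pinned_to="cluster-a"))  # None
# A new, unpinned service can be placed wherever capacity exists.
print(allocate(clusters, 4))  # cluster-b
```

In this model the region as a whole has room for the request, yet the pinned service still fails because it is restricted to one pool.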

Solution

Follow the guidance for allocation failures in the following scenarios.

Not pinned to a cluster

The first time you deploy a Cloud service (classic), no cluster has been selected yet, so the cloud service isn't pinned. The deployment may fail because:

  • You've selected a particular size that isn't available in the region.
  • The combination of sizes needed across different roles isn't available in the region.

When you experience an allocation error in this scenario, the recommended course of action is to check the sizes available in the region and change the size you previously specified.

  1. You can check the sizes available in a region on the Cloud service (classic) products page.

    Note

    The Products page won't show the available capacity. For any new allocation, Azure should be able to pick the optimal cluster in your region at that point in time.

  2. Update the service definition file for your Cloud service (classic) to specify a different product size that is available in your region.
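The two steps above can be sketched in Python. This is a minimal sketch under stated assumptions: the set of available sizes is something you copy by hand from the products page, the size names and role names in the sample are illustrative, and the helper functions unavailable_roles and set_role_size are hypothetical. The sketch only relies on the vmsize attribute of the WebRole and WorkerRole elements in the service definition (.csdef) file; roles without an explicit vmsize are skipped.

```python
import xml.etree.ElementTree as ET

# XML namespace of Cloud Services (classic) service definition files.
CSDEF_NS = "http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceDefinition"
ROLE_TAGS = (f"{{{CSDEF_NS}}}WebRole", f"{{{CSDEF_NS}}}WorkerRole")


def unavailable_roles(csdef_xml: str, available_sizes: set) -> dict:
    """Map role name -> vmsize for roles whose explicit size is not available."""
    root = ET.fromstring(csdef_xml)
    result = {}
    for role in root:
        size = role.get("vmsize")
        if role.tag in ROLE_TAGS and size is not None and size not in available_sizes:
            result[role.get("name")] = size
    return result


def set_role_size(csdef_xml: str, role_name: str, new_size: str) -> str:
    """Return the service definition with the named role's vmsize replaced."""
    ET.register_namespace("", CSDEF_NS)  # keep the default namespace on output
    root = ET.fromstring(csdef_xml)
    for role in root:
        if role.tag in ROLE_TAGS and role.get("name") == role_name:
            role.set("vmsize", new_size)
    return ET.tostring(root, encoding="unicode")


# Illustrative service definition; names and sizes are placeholders.
SAMPLE = """<ServiceDefinition name="MyService" xmlns="http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceDefinition">
  <WebRole name="WebRole1" vmsize="Standard_D1_v2" />
  <WorkerRole name="WorkerRole1" vmsize="Standard_A8" />
</ServiceDefinition>"""

# Assumption: sizes you looked up manually for your region.
available = {"Standard_D1_v2", "Standard_D2_v2"}

print(unavailable_roles(SAMPLE, available))  # {'WorkerRole1': 'Standard_A8'}
updated = set_role_size(SAMPLE, "WorkerRole1", "Standard_D2_v2")
print(unavailable_roles(updated, available))  # {}
```

Checking every role in one pass matters because a deployment can fail on the combination of sizes, not only on a single role.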

Pinned to a cluster

Existing cloud services are pinned to a cluster. Any further deployments for the Cloud service (classic) will happen in the same cluster.

When you experience an allocation error in this scenario, the recommended course of action is to redeploy to a new Cloud service (classic) and update the CNAME.

Tip

This solution is likely to be the most successful because it allows the platform to choose from all clusters in that region.

Note

This solution should incur zero downtime.

  1. Deploy the workload to a new Cloud service (classic).

    Warning

    If you don't want to lose the IP address associated with this deployment slot, you can use Solution 3 - Keep the IP address.

  2. Update the CNAME or A record to point traffic to the new Cloud service (classic).

  3. Once zero traffic is going to the old site, you can delete the old Cloud service (classic).
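Before deleting the old cloud service in step 3, one sanity check is to confirm that your custom domain now resolves to the new service. The following is a minimal sketch, not an official procedure: the function names and host names are hypothetical, and a matching DNS answer doesn't by itself prove that zero traffic reaches the old site, because clients may still hold cached records.

```python
import socket
from typing import Callable, Set


def resolve_ipv4(hostname: str) -> Set[str]:
    """Resolve a hostname to its set of IPv4 addresses using the system resolver."""
    infos = socket.getaddrinfo(hostname, None, family=socket.AF_INET)
    return {info[4][0] for info in infos}


def points_to_new_service(custom_domain: str, new_service_host: str,
                          resolver: Callable[[str], Set[str]] = resolve_ipv4) -> bool:
    """True when every address of custom_domain is also an address of new_service_host."""
    domain_ips = resolver(custom_domain)
    return bool(domain_ips) and domain_ips <= resolver(new_service_host)


# Offline demonstration with a fake resolver; all names and addresses are placeholders.
fake_dns = {
    "www.contoso.com": {"203.0.113.10"},
    "contoso-new.cloudapp.net": {"203.0.113.10"},
    "contoso-old.cloudapp.net": {"198.51.100.7"},
}
print(points_to_new_service("www.contoso.com", "contoso-new.cloudapp.net",
                            resolver=fake_dns.__getitem__))  # True
print(points_to_new_service("www.contoso.com", "contoso-old.cloudapp.net",
                            resolver=fake_dns.__getitem__))  # False
```

To run the check against real DNS, omit the resolver argument so that the system resolver is used.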

See Troubleshooting Cloud service (classic) allocation failures | Microsoft Docs for further remediation steps.

Next steps

For more allocation failure solutions and background information:

If your Azure issue isn't addressed in this article, visit the Azure forums on MSDN and Stack Overflow. You can post your issue in these forums. You can also submit an Azure support request. To submit a support request, on the Azure support page, select Get support.