Troubleshoot SSIS Integration Runtime management in Azure Data Factory

This document provides guidance for troubleshooting management issues with the SSIS Integration Runtime (SSIS IR).

Overview

If provisioning or deprovisioning the SSIS IR fails, an error message appears in the ADF portal or is returned by the PowerShell cmdlet. The error always consists of an error code and a detailed error message.

If the error code is InternalServerError, the service is experiencing a transient issue. Retry the operation later. If retrying doesn't help, contact the Azure Data Factory support team.
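
The retry guidance above can be sketched as a small helper. This is a minimal illustration only: `ProvisioningError`, `retry_transient`, and the backoff values are hypothetical names and choices made for this example, not part of any Azure SDK. The sketch assumes the caller wraps the real start or stop operation in a function that raises an exception carrying the error code.

```python
import time


class ProvisioningError(Exception):
    """Hypothetical exception that carries the error code reported for the IR."""

    def __init__(self, code, message=""):
        super().__init__(f"{code}: {message}")
        self.code = code


def is_internal_server_error(exc):
    # Only InternalServerError is described as transient in this document.
    return getattr(exc, "code", None) == "InternalServerError"


def retry_transient(operation, is_transient=is_internal_server_error,
                    attempts=3, base_delay=1.0, sleep=time.sleep):
    """Run operation(); retry with exponential backoff on transient errors only."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except Exception as exc:
            # Give up on the last attempt or on any non-transient error.
            if attempt == attempts or not is_transient(exc):
                raise
            sleep(base_delay * 2 ** (attempt - 1))
```

Errors with any other code are re-raised immediately, because retrying does not fix a configuration problem. If the final attempt still fails with InternalServerError, the exception propagates and the next step is to contact support.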

If the error code is not InternalServerError, the error may be caused by one of three major external dependencies: the Azure SQL Database server or Managed Instance, the custom setup script, or the virtual network configuration.
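
As a rough triage aid, the top-level error codes covered in this document can be mapped to the dependency they point at. The grouping below simply mirrors the sections of this document; the `classify` function and the category strings are illustrative names, and the lists are not guaranteed to be exhaustive.

```python
# Error codes grouped by the section of this document that describes them.
SQL_CODES = {
    "AzureSqlConnectionFailure",
    "CatalogCapacityLimitError",
    "CatalogDbBelongsToAnotherIR",
    "CatalogDbCreationFailure",
    "InvalidCatalogDb",
}
CUSTOM_SETUP_CODES = {"CustomSetupScriptFailure"}
VNET_CODES = {
    "InvalidVnetConfiguration",
    "VNetResourceGroupLockedDuringUpgrade",
    "VNetResourceGroupLockedDuringStart",
    "VNetResourceGroupLockedDuringStop",
    "NodeUnavailable",
}


def classify(error_code):
    """Return the dependency an SSIS IR management error code most likely points at."""
    if error_code == "InternalServerError":
        return "transient service issue"
    if error_code in SQL_CODES:
        return "Azure SQL Database server or Managed Instance"
    if error_code in CUSTOM_SETUP_CODES:
        return "custom setup script"
    if error_code in VNET_CODES:
        return "virtual network configuration"
    return "unknown"
```

For example, `classify("InvalidCatalogDb")` points to the Azure SQL Database server or Managed Instance section, while an unrecognized code falls through to "unknown" and should be investigated from the detailed error message.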

Azure SQL Database server or Managed Instance issues

An Azure SQL Database server or Managed Instance is required when you provision the SSIS IR with an SSIS catalog database. The SSIS IR must be able to access the Azure SQL Database server or Managed Instance, and the account used to connect to it must have permission to create the SSIS catalog database (SSISDB). If an error occurs, the ADF portal shows an error code together with the detailed SQL exception message. Follow the steps below to troubleshoot each error code.

AzureSqlConnectionFailure

You may see this error when you provision a new SSIS IR or while the IR is running.

If the error occurs during IR provisioning, it may have one of the following causes. The error message includes the detailed SqlException message.

  • Network connection issue. Check that the host name of the Azure SQL Database server or Managed Instance is reachable, and that no firewall or network security group (NSG) blocks the SSIS IR from accessing the server.
  • Login failed and SQL authentication is used. The account provided cannot log in to the server. Make sure the correct user account is provided.
  • Login failed and AAD authentication (Managed Identity) is used. Add the Managed Identity of your factory to an AAD group, and make sure the Managed Identity has access permissions to your catalog database server.
  • Connection timeout. This error is always caused by security-related configuration. It is recommended to create a new VM, join the VM to the same VNet as the IR (if the IR is in a VNet), then install SSMS and check the status of the Azure SQL Database server or Managed Instance.
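
The network check in the list above can be approximated with a plain TCP connection test, run from a machine on the same network path as the IR (for example, the VM in the same VNet suggested above). This is a minimal sketch using only the Python standard library. The function name `can_reach` is illustrative, and port 1433 is an assumption based on the default Azure SQL Database port; a Managed Instance public endpoint uses a different port. A successful connection shows only that the port is reachable, not that login will succeed.

```python
import socket


def can_reach(host, port=1433, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
```

A call such as `can_reach("<your-server>.database.windows.net")` that returns False suggests a firewall, NSG, or name resolution problem rather than a credential problem.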

For other issues, refer to the detailed SQL exception error message and fix the issue it describes. If you’re still having problems, contact the Azure SQL Database server or Managed Instance support team.

If the error occurs while the IR is running, a network security group or firewall change has probably left the SSIS IR worker nodes unable to access the Azure SQL Database server or Managed Instance. Unblock the SSIS IR worker nodes so that they can access the Azure SQL Database server or Managed Instance again.

CatalogCapacityLimitError

The error message is similar to “The database 'SSISDB' has reached its size quota. Partition or delete data, drop indexes, or consult the documentation for possible resolutions.” The possible solutions are:

  • Increase the size quota of your SSISDB.
  • Change the configuration of SSISDB to reduce its size, for example:
    • Reduce the retention period and the number of project versions.
    • Reduce the retention period of the log.
    • Change the default logging level.

CatalogDbBelongsToAnotherIR

This error means the Azure SQL Database server or Managed Instance already has an SSISDB that was created and is used by another IR. Either provide a different Azure SQL Database server or Managed Instance, or delete the existing SSISDB and restart the new IR.

CatalogDbCreationFailure

This error can have the following causes:

  • The user account configured for the SSIS IR has no permission to create the database. Grant the user permission to create the database.
  • Database creation timed out, for example because of an execution timeout or a database operation timeout. Retry later to see whether the issue is resolved. If retrying doesn’t work, contact the Azure SQL Database server or Managed Instance support team.

For other issues, check the SQL exception error message and fix the issue it describes. If you’re still having problems, contact the Azure SQL Database server or Managed Instance support team.

InvalidCatalogDb

The error message is similar to “Invalid object name 'catalog.catalog_properties'.” It means either that you already have a database named SSISDB that was not created by the SSIS IR, or that the database was left in an invalid state by errors during the last SSIS IR provisioning. Drop the existing database named SSISDB, or configure a new Azure SQL Database server or Managed Instance for the IR.

Custom setup

Custom setup provides an interface for adding your own setup steps during the provisioning or reconfiguration of your SSIS IR. For more information, see Customize setup for the Azure-SSIS integration runtime.

Make sure your container holds only the necessary custom setup files, because every file in the container is downloaded onto the SSIS IR worker nodes. It’s recommended to test the custom setup script on a local machine and fix any script execution issues before running the script in the SSIS IR.

The custom setup script container is also checked while the IR is running, because the SSIS IR is updated regularly and each update needs to access the container again to download and reinstall the custom setup. The check covers whether the container is accessible and whether the main.cmd file exists.

Any custom setup error is reported with the code CustomSetupScriptFailure. Check the error message, which contains a sub error code, and follow the steps below to troubleshoot each sub error code.

CustomSetupScriptBlobContainerInaccessible

This error means the SSIS IR cannot access your Azure blob container for custom setup. Check that the SAS URI of the container is reachable and has not expired.

If the IR is running, stop it first, reconfigure it with a new custom setup container SAS URI, and then start it again.
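
Before reconfiguring the IR, a SAS URI can be inspected locally for an expired `se` (signed expiry) value and for missing `sp` (signed permissions) flags. The sketch below uses only the Python standard library and does not contact the storage account, so it cannot confirm that the container is actually reachable. The function name `inspect_sas_uri` is illustrative, and treating read, write, and list (`r`, `w`, `l`) as the required set is an assumption made for this example; this document itself only states that write permission is needed to upload logs.

```python
from datetime import datetime, timezone
from urllib.parse import parse_qs, urlsplit


def inspect_sas_uri(sas_uri, now=None, required_permissions="rwl"):
    """Parse the expiry and permission fields of a SAS URI without calling Azure."""
    now = now or datetime.now(timezone.utc)
    query = parse_qs(urlsplit(sas_uri).query)
    expiry_text = query.get("se", [None])[0]
    permissions = query.get("sp", [""])[0]

    expiry = None
    if expiry_text:
        # SAS expiry times are ISO 8601 in UTC; older Python versions reject "Z".
        expiry = datetime.fromisoformat(expiry_text.replace("Z", "+00:00"))
        if expiry.tzinfo is None:
            expiry = expiry.replace(tzinfo=timezone.utc)

    return {
        "expiry": expiry,
        # A URI without "se" (for example, one relying on a stored access
        # policy) cannot be judged here and is reported as not expired.
        "expired": expiry is not None and expiry <= now,
        "missing_permissions": [p for p in required_permissions if p not in permissions],
    }
```

If `expired` is True or `missing_permissions` is not empty, generate a new SAS URI for the container and reconfigure the IR with it.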

CustomSetupScriptNotFound

This error means the SSIS IR cannot find the custom setup script (main.cmd) in your blob container. Make sure main.cmd, the entry point for custom setup, exists in the container.

CustomSetupScriptExecutionFailure

This error means the execution of the custom setup script (main.cmd) failed. Try the script on your local machine first, or check the custom setup execution logs in your blob container.

CustomSetupScriptTimeout

This error means the execution of the custom setup script timed out. Make sure your blob container holds only the necessary custom setup files. You can also check the custom setup execution logs in your blob container. Custom setup times out after a maximum of 45 minutes, which includes the time needed to download all files from your container and install them on the SSIS IR. If you need a longer period, raise a support ticket.

CustomSetupScriptLogUploadFailure

This error means that uploading the custom setup execution logs to your blob container failed, either because the SSIS IR has no write permission to the container or because of storage or network issues. If custom setup succeeded, this error does not affect any SSIS function, but the logs are missing. If custom setup failed with another error and the logs also could not be uploaded, this error is reported first so that the logs can be uploaded for analysis; once it is resolved, the more specific error is reported. If the issue persists after a retry, contact the Azure Data Factory support team.

Virtual network configuration

When you join the SSIS IR to a virtual network (VNet), the SSIS IR uses the VNet in the user's subscription. For more information, see Join an Azure-SSIS integration runtime to a virtual network.

When a VNet-related issue occurs, you will see one of the errors below.

InvalidVnetConfiguration

This error can have various causes. Follow the steps below to troubleshoot each sub error code.

Forbidden

The error message is similar to “subnetId is not enabled for current account. Microsoft.Batch resource provider is not registered under the same subscription of VNet.”

This error means Azure Batch cannot access your VNet. Register the Microsoft.Batch resource provider in the same subscription as the VNet.

InvalidPropertyValue

The error message is similar to “Either the specified VNet does not exist, or the Batch service does not have access to it” or “The specified subnet xxx does not exist”.

This error means that the VNet does not exist, that the Azure Batch service cannot access it, or that the subnet provided does not exist. Make sure the VNet and subnet exist and that Azure Batch can access them.

MisconfiguredDnsServerOrNsgSettings

The error message is similar to “Failed to provision Integration Runtime in Vnet. If the DNS server or NSG settings are configured, make sure the DNS server is accessible and NSG is configured properly”.

You probably have a customized DNS server or NSG configuration that prevents the Azure server names required by the SSIS IR from being resolved or accessed. For more information, see the SSIS IR VNet configuration document. If you’re still having problems, contact the Azure Data Factory support team.
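
To see whether a custom DNS server can resolve a given set of host names, a quick check can be run from a VM in the same VNet that uses the same DNS settings. The sketch below relies only on the Python standard library. `resolve_all` is an illustrative name, and the list of host names is left to the caller, because this document does not enumerate the names the SSIS IR requires. Successful resolution does not prove that NSG rules allow the traffic.

```python
import socket


def resolve_all(host_names):
    """Map each host name to the sorted list of addresses it resolves to.

    An empty list means the name could not be resolved with the DNS settings
    of the machine running this check.
    """
    results = {}
    for name in host_names:
        try:
            infos = socket.getaddrinfo(name, None)
            # Each entry's sockaddr tuple starts with the address string.
            results[name] = sorted({info[4][0] for info in infos})
        except socket.gaierror:
            results[name] = []
    return results
```

Any name that maps to an empty list is a candidate for a missing forwarder or record on the custom DNS server.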

VNetResourceGroupLockedDuringUpgrade

The SSIS IR is updated automatically on a regular basis. During an upgrade, a new Azure Batch pool is created and the old pool is deleted; the VNet-related resources for the old pool are deleted, and new VNet-related resources are created in your subscription. This error means that deleting the VNet-related resources for the old pool failed because of a delete lock at the subscription or resource group level. Remove the delete lock.

VNetResourceGroupLockedDuringStart

SSIS IR provisioning can fail for various reasons, and when it does, all the resources that were created are deleted. This error means the VNet resources could not be deleted because of a resource delete lock at the subscription or resource group level. Remove the delete lock and restart the IR.

VNetResourceGroupLockedDuringStop

When the SSIS IR is stopped, all VNet-related resources are deleted. This error means the deletion failed because of a resource delete lock at the subscription or resource group level. Remove the delete lock and try stopping the IR again.

NodeUnavailable

This error occurs while the IR is running. It means the IR was healthy and has become unhealthy, which is always caused by a change to the DNS server or NSG configuration that prevents the SSIS IR from connecting to a service it depends on. Fix the DNS server or NSG configuration issue. For more information, see SSIS IR VNet configuration. If you’re still having problems, contact the Azure Data Factory support team.