Copy and transform data in Azure SQL Database by using Azure Data Factory

APPLIES TO: Azure Data Factory; Azure Synapse Analytics (Preview)

This article outlines how to use Copy Activity in Azure Data Factory to copy data from and to Azure SQL Database. To learn about Azure Data Factory, read the introductory article.

Supported capabilities

This Azure SQL Database connector is supported for the following activities:

  • Copy activity
  • Lookup activity
  • GetMetadata activity

For the Copy activity, this Azure SQL Database connector supports these functions:

  • Copying data by using SQL authentication and Azure Active Directory (Azure AD) application token authentication with a service principal or managed identities for Azure resources.
  • As a source, retrieving data by using a SQL query or a stored procedure. You can also choose to parallel copy from an Azure SQL Database source; see the Parallel copy from SQL database section for details.
  • As a sink, automatically creating the destination table if it doesn't exist based on the source schema, and appending data to a table or invoking a stored procedure with custom logic during the copy.

If you use the Azure SQL Database serverless tier, note that when the server is paused, the activity run fails instead of waiting for the auto-resume to be ready. You can add an activity retry or chain additional activities to make sure the server is live upon the actual execution, as in the sketch below.
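
For example, a minimal sketch of an activity-level retry policy (the retry count and interval shown here are illustrative, not prescriptive):

"activities":[
    {
        "name": "CopyFromServerlessAzureSqlDb",
        "type": "Copy",
        "policy": {
            "retry": 3,
            "retryIntervalInSeconds": 60
        },
        "typeProperties": { "...": "..." }
    }
]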

Note

Azure SQL Database Always Encrypted isn't supported by this connector now. To work around it, you can use a generic ODBC connector and a SQL Server ODBC driver via a self-hosted integration runtime. Learn more from the Using Always Encrypted section.

Important

If you copy data by using the Azure integration runtime, configure a server-level firewall rule so that Azure services can access the server. If you copy data by using a self-hosted integration runtime, configure the firewall to allow the appropriate IP range. This range includes the machine's IP that's used to connect to Azure SQL Database.

Get started

To perform the Copy activity with a pipeline, you can use one of the supported tools or SDKs, such as the Copy Data tool, the Azure portal, the .NET or Python SDK, Azure PowerShell, the REST API, or an Azure Resource Manager template.

The following sections provide details about properties that are used to define Azure Data Factory entities specific to an Azure SQL Database connector.

Linked service properties

These properties are supported for an Azure SQL Database linked service:

Property | Description | Required
type | The type property must be set to AzureSqlDatabase. | Yes
connectionString | Specify the information needed to connect to the Azure SQL Database instance for the connectionString property. You can also put a password or service principal key in Azure Key Vault. If it's SQL authentication, pull the password configuration out of the connection string. For more information, see the JSON example following the table and Store credentials in Azure Key Vault. | Yes
servicePrincipalId | Specify the application's client ID. | Yes, when you use Azure AD authentication with a service principal
servicePrincipalKey | Specify the application's key. Mark this field as SecureString to store it securely in Azure Data Factory, or reference a secret stored in Azure Key Vault. | Yes, when you use Azure AD authentication with a service principal
tenant | Specify the tenant information, like the domain name or tenant ID, under which your application resides. Retrieve it by hovering over the upper-right corner of the Azure portal. | Yes, when you use Azure AD authentication with a service principal
azureCloudType | For service principal authentication, specify the type of Azure cloud environment to which your Azure AD application is registered. The allowed value is AzureChina. By default, the data factory's cloud environment is used. | No
connectVia | The integration runtime used to connect to the data store. You can use the Azure integration runtime or a self-hosted integration runtime if your data store is located in a private network. If not specified, the default Azure integration runtime is used. | No

For different authentication types, refer to the corresponding sections on prerequisites and JSON samples later in this article.

Tip

If you hit an error with the error code "UserErrorFailedToConnectToSqlServer" and a message like "The session limit for the database is XXX and has been reached," add Pooling=false to your connection string and try again, as shown below.
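
For example, the SQL authentication connection string from the sample later in this article, with pooling disabled:

"connectionString": "Data Source=tcp:<servername>.database.chinacloudapi.cn,1433;Initial Catalog=<databasename>;User ID=<username>@<servername>;Password=<password>;Trusted_Connection=False;Encrypt=True;Connection Timeout=30;Pooling=false"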

SQL authentication

Example: using SQL authentication

{
    "name": "AzureSqlDbLinkedService",
    "properties": {
        "type": "AzureSqlDatabase",
        "typeProperties": {
            "connectionString": "Data Source=tcp:<servername>.database.chinacloudapi.cn,1433;Initial Catalog=<databasename>;User ID=<username>@<servername>;Password=<password>;Trusted_Connection=False;Encrypt=True;Connection Timeout=30"
        },
        "connectVia": {
            "referenceName": "<name of Integration Runtime>",
            "type": "IntegrationRuntimeReference"
        }
    }
}

Example: password in Azure Key Vault

{
    "name": "AzureSqlDbLinkedService",
    "properties": {
        "type": "AzureSqlDatabase",
        "typeProperties": {
            "connectionString": "Data Source=tcp:<servername>.database.chinacloudapi.cn,1433;Initial Catalog=<databasename>;User ID=<username>@<servername>;Trusted_Connection=False;Encrypt=True;Connection Timeout=30",
            "password": {
                "type": "AzureKeyVaultSecret",
                "store": {
                    "referenceName": "<Azure Key Vault linked service name>",
                    "type": "LinkedServiceReference"
                },
                "secretName": "<secretName>"
            }
        },
        "connectVia": {
            "referenceName": "<name of Integration Runtime>",
            "type": "IntegrationRuntimeReference"
        }
    }
}

Service principal authentication

To use service principal-based Azure AD application token authentication, follow these steps:

  1. Create an Azure Active Directory application from the Azure portal. Make note of the application name and the following values that define the linked service:

    • Application ID
    • Application key
    • Tenant ID
  2. Provision an Azure Active Directory administrator for your server on the Azure portal if you haven't already done so. The Azure AD administrator must be an Azure AD user or Azure AD group, but it can't be a service principal. This step is done so that, in the next step, you can use an Azure AD identity to create a contained database user for the service principal.

  3. Create contained database users for the service principal. Connect to the database from or to which you want to copy data by using tools like SQL Server Management Studio, with an Azure AD identity that has at least ALTER ANY USER permission. Run the following T-SQL:

    CREATE USER [your application name] FROM EXTERNAL PROVIDER;
    
  4. Grant the service principal the needed permissions as you normally do for SQL users or others. Run the following code. For more options, see this document.

    ALTER ROLE [role name] ADD MEMBER [your application name];
    
  5. Configure an Azure SQL Database linked service in Azure Data Factory.

Linked service example that uses service principal authentication

{
    "name": "AzureSqlDbLinkedService",
    "properties": {
        "type": "AzureSqlDatabase",
        "typeProperties": {
            "connectionString": "Data Source=tcp:<servername>.database.chinacloudapi.cn,1433;Initial Catalog=<databasename>;Connection Timeout=30",
            "servicePrincipalId": "<service principal id>",
            "servicePrincipalKey": {
                "type": "SecureString",
                "value": "<service principal key>"
            },
            "tenant": "<tenant info, e.g. microsoft.partner.onmschina.cn>"
        },
        "connectVia": {
            "referenceName": "<name of Integration Runtime>",
            "type": "IntegrationRuntimeReference"
        }
    }
}

Managed identities for Azure resources authentication

A data factory can be associated with a managed identity for Azure resources that represents the specific data factory. You can use this managed identity for Azure SQL Database authentication. The designated factory can access and copy data from or to your database by using this identity.

To use managed identity authentication, follow these steps.

  1. Provision an Azure Active Directory administrator for your server on the Azure portal if you haven't already done so. The Azure AD administrator can be an Azure AD user or an Azure AD group. If you grant an admin role to a group that contains the managed identity, skip steps 3 and 4. The administrator has full access to the database.

  2. Create contained database users for the Azure Data Factory managed identity. Connect to the database from or to which you want to copy data by using tools like SQL Server Management Studio, with an Azure AD identity that has at least ALTER ANY USER permission. Run the following T-SQL:

    CREATE USER [your Data Factory name] FROM EXTERNAL PROVIDER;
    
  3. Grant the Data Factory managed identity the needed permissions as you normally do for SQL users and others. Run the following code. For more options, see this document.

    ALTER ROLE [role name] ADD MEMBER [your Data Factory name];
    
  4. Configure an Azure SQL Database linked service in Azure Data Factory.

Example

{
    "name": "AzureSqlDbLinkedService",
    "properties": {
        "type": "AzureSqlDatabase",
        "typeProperties": {
            "connectionString": "Data Source=tcp:<servername>.database.chinacloudapi.cn,1433;Initial Catalog=<databasename>;Connection Timeout=30"
        },
        "connectVia": {
            "referenceName": "<name of Integration Runtime>",
            "type": "IntegrationRuntimeReference"
        }
    }
}

Dataset properties

For a full list of sections and properties available to define datasets, see Datasets.

The following properties are supported for the Azure SQL Database dataset:

Property | Description | Required
type | The type property of the dataset must be set to AzureSqlTable. | Yes
schema | Name of the schema. | No for source, Yes for sink
table | Name of the table/view. | No for source, Yes for sink
tableName | Name of the table/view with schema. This property is supported for backward compatibility. For new workloads, use schema and table. | No for source, Yes for sink

Dataset properties example

{
    "name": "AzureSQLDbDataset",
    "properties":
    {
        "type": "AzureSqlTable",
        "linkedServiceName": {
            "referenceName": "<Azure SQL Database linked service name>",
            "type": "LinkedServiceReference"
        },
        "schema": [ < physical schema, optional, retrievable during authoring > ],
        "typeProperties": {
            "schema": "<schema_name>",
            "table": "<table_name>"
        }
    }
}

Copy activity properties

For a full list of sections and properties available for defining activities, see Pipelines. This section provides a list of properties supported by the Azure SQL Database source and sink.

Azure SQL Database as the source

Tip

To load data from Azure SQL Database efficiently by using data partitioning, learn more from Parallel copy from SQL database.

To copy data from Azure SQL Database, the following properties are supported in the copy activity source section:

Property | Description | Required
type | The type property of the copy activity source must be set to AzureSqlSource. The "SqlSource" type is still supported for backward compatibility. | Yes
sqlReaderQuery | This property uses the custom SQL query to read data. An example is select * from MyTable. | No
sqlReaderStoredProcedureName | The name of the stored procedure that reads data from the source table. The last SQL statement must be a SELECT statement in the stored procedure. | No
storedProcedureParameters | Parameters for the stored procedure. Allowed values are name or value pairs. The names and casing of parameters must match the names and casing of the stored procedure parameters. | No
isolationLevel | Specifies the transaction locking behavior for the SQL source. The allowed values are: ReadCommitted, ReadUncommitted, RepeatableRead, Serializable, Snapshot. If not specified, the database's default isolation level is used. Refer to this doc for more details. | No
partitionOptions | Specifies the data partitioning options used to load data from Azure SQL Database. Allowed values are: None (default), PhysicalPartitionsOfTable, and DynamicRange. When a partition option is enabled (that is, not None), the degree of parallelism to concurrently load data from Azure SQL Database is controlled by the parallelCopies setting on the copy activity. | No
partitionSettings | Specifies the group of settings for data partitioning. Applies when the partition option isn't None. | No

Under partitionSettings:

partitionColumnName | Specify the name of the source column, in integer or date/datetime type, that will be used by range partitioning for parallel copy. If not specified, the index or the primary key of the table is auto-detected and used as the partition column. Applies when the partition option is DynamicRange. If you use a query to retrieve the source data, hook ?AdfDynamicRangePartitionCondition into the WHERE clause. For an example, see the Parallel copy from SQL database section. | No
partitionUpperBound | The maximum value of the partition column for partition range splitting. This value is used to decide the partition stride, not for filtering the rows in the table. All rows in the table or query result will be partitioned and copied. If not specified, the copy activity auto-detects the value. Applies when the partition option is DynamicRange. For an example, see the Parallel copy from SQL database section. | No
partitionLowerBound | The minimum value of the partition column for partition range splitting. This value is used to decide the partition stride, not for filtering the rows in the table. All rows in the table or query result will be partitioned and copied. If not specified, the copy activity auto-detects the value. Applies when the partition option is DynamicRange. For an example, see the Parallel copy from SQL database section. | No

Points to note:

  • If sqlReaderQuery is specified for AzureSqlSource, the copy activity runs this query against the Azure SQL Database source to get the data. You can also specify a stored procedure by specifying sqlReaderStoredProcedureName and storedProcedureParameters if the stored procedure takes parameters.
  • If you don't specify either sqlReaderQuery or sqlReaderStoredProcedureName, the columns defined in the "structure" section of the dataset JSON are used to construct a query, as in the sketch after this list. The query select column1, column2 from mytable runs against Azure SQL Database. If the dataset definition doesn't have "structure," all columns are selected from the table.
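
For illustration, a minimal sketch of a dataset with a "structure" section (the column names and types here are hypothetical):

{
    "name": "AzureSQLDbDataset",
    "properties": {
        "type": "AzureSqlTable",
        "linkedServiceName": {
            "referenceName": "<Azure SQL Database linked service name>",
            "type": "LinkedServiceReference"
        },
        "structure": [
            { "name": "column1", "type": "String" },
            { "name": "column2", "type": "Int32" }
        ],
        "typeProperties": {
            "schema": "dbo",
            "table": "mytable"
        }
    }
}

With this dataset and no query or stored procedure specified, the copy activity issues select column1, column2 from mytable.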

SQL query example

"activities":[
    {
        "name": "CopyFromAzureSQLDatabase",
        "type": "Copy",
        "inputs": [
            {
                "referenceName": "<Azure SQL Database input dataset name>",
                "type": "DatasetReference"
            }
        ],
        "outputs": [
            {
                "referenceName": "<output dataset name>",
                "type": "DatasetReference"
            }
        ],
        "typeProperties": {
            "source": {
                "type": "AzureSqlSource",
                "sqlReaderQuery": "SELECT * FROM MyTable"
            },
            "sink": {
                "type": "<sink type>"
            }
        }
    }
]

Stored procedure example

"activities":[
    {
        "name": "CopyFromAzureSQLDatabase",
        "type": "Copy",
        "inputs": [
            {
                "referenceName": "<Azure SQL Database input dataset name>",
                "type": "DatasetReference"
            }
        ],
        "outputs": [
            {
                "referenceName": "<output dataset name>",
                "type": "DatasetReference"
            }
        ],
        "typeProperties": {
            "source": {
                "type": "AzureSqlSource",
                "sqlReaderStoredProcedureName": "CopyTestSrcStoredProcedureWithParameters",
                "storedProcedureParameters": {
                    "stringData": { "value": "str3" },
                    "identifier": { "value": "$$Text.Format('{0:yyyy}', <datetime parameter>)", "type": "Int"}
                }
            },
            "sink": {
                "type": "<sink type>"
            }
        }
    }
]

Stored procedure definition

CREATE PROCEDURE CopyTestSrcStoredProcedureWithParameters
(
    @stringData varchar(20),
    @identifier int
)
AS
SET NOCOUNT ON;
BEGIN
    -- Compare against the procedure parameters (note the @ prefix;
    -- without it, each column would be compared against itself).
    SELECT *
    FROM dbo.UnitTestSrcTable
    WHERE dbo.UnitTestSrcTable.stringData != @stringData
        AND dbo.UnitTestSrcTable.identifier != @identifier
END
GO

Azure SQL Database as the sink

Tip

Learn more about the supported write behaviors, configurations, and best practices from Best practice for loading data into Azure SQL Database.

To copy data to Azure SQL Database, the following properties are supported in the copy activity sink section:

Property | Description | Required
type | The type property of the copy activity sink must be set to AzureSqlSink. The "SqlSink" type is still supported for backward compatibility. | Yes
preCopyScript | Specify a SQL query for the copy activity to run before writing data into Azure SQL Database. It's invoked only once per copy run. Use this property to clean up the preloaded data. | No
tableOption | Specifies whether to automatically create the sink table, if it doesn't exist, based on the source schema. Auto table creation is not supported when the sink specifies a stored procedure. Allowed values are: none (default), autoCreate. | No
sqlWriterStoredProcedureName | The name of the stored procedure that defines how to apply source data into a target table. This stored procedure is invoked per batch. For operations that run only once and have nothing to do with source data, for example, delete or truncate, use the preCopyScript property. See the example from Invoke a stored procedure from a SQL sink. | No
storedProcedureTableTypeParameterName | The parameter name of the table type specified in the stored procedure. | No
sqlWriterTableType | The table type name to be used in the stored procedure. The copy activity makes the data being moved available in a temp table with this table type. Stored procedure code can then merge the data that's being copied with existing data. | No
storedProcedureParameters | Parameters for the stored procedure. Allowed values are name and value pairs. Names and casing of parameters must match the names and casing of the stored procedure parameters. | No
writeBatchSize | Number of rows to insert into the SQL table per batch. The allowed value is integer (number of rows). By default, Azure Data Factory dynamically determines the appropriate batch size based on the row size. | No
writeBatchTimeout | The wait time for the batch insert operation to finish before it times out. The allowed value is timespan. An example is "00:30:00" (30 minutes). | No
disableMetricsCollection | Data Factory collects metrics such as Azure SQL Database DTUs for copy performance optimization and recommendations. If you are concerned with this behavior, specify true to turn it off. | No (default is false)

Example 1: Append data

"activities":[
    {
        "name": "CopyToAzureSQLDatabase",
        "type": "Copy",
        "inputs": [
            {
                "referenceName": "<input dataset name>",
                "type": "DatasetReference"
            }
        ],
        "outputs": [
            {
                "referenceName": "<Azure SQL Database output dataset name>",
                "type": "DatasetReference"
            }
        ],
        "typeProperties": {
            "source": {
                "type": "<source type>"
            },
            "sink": {
                "type": "AzureSqlSink",
                "tableOption": "autoCreate",
                "writeBatchSize": 100000
            }
        }
    }
]

Example 2: Invoke a stored procedure during copy

Learn more details from Invoke a stored procedure from a SQL sink.

"activities":[
    {
        "name": "CopyToAzureSQLDatabase",
        "type": "Copy",
        "inputs": [
            {
                "referenceName": "<input dataset name>",
                "type": "DatasetReference"
            }
        ],
        "outputs": [
            {
                "referenceName": "<Azure SQL Database output dataset name>",
                "type": "DatasetReference"
            }
        ],
        "typeProperties": {
            "source": {
                "type": "<source type>"
            },
            "sink": {
                "type": "AzureSqlSink",
                "sqlWriterStoredProcedureName": "CopyTestStoredProcedureWithParameters",
                "storedProcedureTableTypeParameterName": "MyTable",
                "sqlWriterTableType": "MyTableType",
                "storedProcedureParameters": {
                    "identifier": { "value": "1", "type": "Int" },
                    "stringData": { "value": "str1" }
                }
            }
        }
    }
]

Parallel copy from SQL database

The Azure SQL Database connector in the copy activity provides built-in data partitioning to copy data in parallel. You can find data partitioning options on the Source tab of the copy activity.

[Screenshot: partition options on the copy activity Source tab]

When you enable partitioned copy, the copy activity runs parallel queries against your Azure SQL Database source to load data by partitions. The parallel degree is controlled by the parallelCopies setting on the copy activity. For example, if you set parallelCopies to four, Data Factory concurrently generates and runs four queries based on your specified partition option and settings, and each query retrieves a portion of data from your Azure SQL Database.

We suggest that you enable parallel copy with data partitioning, especially when you load a large amount of data from Azure SQL Database. The following are suggested configurations for different scenarios. When copying data into a file-based data store, it's recommended to write to a folder as multiple files (only specify the folder name); in that case, the performance is better than writing to a single file.

Scenario 1: Full load from a large table, with physical partitions.

Suggested settings:
Partition option: Physical partitions of table.

During execution, Data Factory automatically detects the physical partitions and copies data by partitions. To check whether your table has physical partitions, refer to the sample query later in this section.

Scenario 2: Full load from a large table, without physical partitions, but with an integer or datetime column for data partitioning.

Suggested settings:
Partition option: Dynamic range partition.
Partition column (optional): Specify the column used to partition data. If not specified, the index or primary key column is used.
Partition upper bound and partition lower bound (optional): Specify if you want to determine the partition stride. This is not for filtering the rows in the table; all rows in the table will be partitioned and copied. If not specified, the copy activity auto-detects the values.

For example, if your partition column "ID" has values ranging from 1 to 100, and you set the lower bound as 20 and the upper bound as 80, with parallel copy as 4, Data Factory retrieves data by 4 partitions: IDs in range <=20, [21, 50], [51, 80], and >=81, respectively.

Scenario 3: Load a large amount of data by using a custom query, without physical partitions, but with an integer or date/datetime column for data partitioning.

Suggested settings:
Partition option: Dynamic range partition.
Query: SELECT * FROM <TableName> WHERE ?AdfDynamicRangePartitionCondition AND <your_additional_where_clause>.
Partition column: Specify the column used to partition data.
Partition upper bound and partition lower bound (optional): Specify if you want to determine the partition stride. This is not for filtering the rows in the table; all rows in the query result will be partitioned and copied. If not specified, the copy activity auto-detects the value.

During execution, Data Factory replaces ?AdfDynamicRangePartitionCondition with the actual column name and value ranges for each partition, and sends it to Azure SQL Database. For example, if your partition column "ID" has values ranging from 1 to 100, and you set the lower bound as 20 and the upper bound as 80, with parallel copy as 4, Data Factory retrieves data by 4 partitions: IDs in range <=20, [21, 50], [51, 80], and >=81, respectively.

Here are more sample queries for different scenarios:

1. Query the whole table:

SELECT * FROM <TableName> WHERE ?AdfDynamicRangePartitionCondition

2. Query from a table with column selection and additional where-clause filters:

SELECT <column_list> FROM <TableName> WHERE ?AdfDynamicRangePartitionCondition AND <your_additional_where_clause>

3. Query with subqueries:

SELECT <column_list> FROM (<your_sub_query>) AS T WHERE ?AdfDynamicRangePartitionCondition AND <your_additional_where_clause>

4. Query with partition in subquery:

SELECT <column_list> FROM (SELECT <your_sub_query_column_list> FROM <TableName> WHERE ?AdfDynamicRangePartitionCondition) AS T

Best practices to load data with the partition option:

  1. Choose a distinctive column as the partition column (like a primary key or unique key) to avoid data skew.
  2. If the table has built-in partitions, use the partition option "Physical partitions of table" to get better performance.
  3. If you use the Azure integration runtime to copy data, you can set a larger "Data Integration Units (DIU)" value (>4) to utilize more computing resources. Check the applicable scenarios there.
  4. "Degree of copy parallelism" controls the partition numbers; setting this number too large can sometimes hurt performance. We recommend setting it to (DIU or number of self-hosted IR nodes) * (2 to 4). For example, with 8 DIUs, a degree of copy parallelism between 16 and 32 is a reasonable starting point; see the sketch after this list.
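
A minimal sketch of how these settings appear together in the copy activity (the parallelCopies and dataIntegrationUnits values here are illustrative, and dataIntegrationUnits is optional):

"typeProperties": {
    "source": {
        "type": "AzureSqlSource",
        "partitionOption": "PhysicalPartitionsOfTable"
    },
    "sink": {
        "type": "<sink type>"
    },
    "parallelCopies": 16,
    "dataIntegrationUnits": 8
}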

Example: full load from a large table with physical partitions

"source": {
    "type": "AzureSqlSource",
    "partitionOption": "PhysicalPartitionsOfTable"
}

Example: query with dynamic range partition

"source": {
    "type": "AzureSqlSource",
    "query": "SELECT * FROM <TableName> WHERE ?AdfDynamicRangePartitionCondition AND <your_additional_where_clause>",
    "partitionOption": "DynamicRange",
    "partitionSettings": {
        "partitionColumnName": "<partition_column_name>",
        "partitionUpperBound": "<upper_value_of_partition_column (optional) to decide the partition stride, not as data filter>",
        "partitionLowerBound": "<lower_value_of_partition_column (optional) to decide the partition stride, not as data filter>"
    }
}

Sample query to check physical partitions

SELECT DISTINCT s.name AS SchemaName, t.name AS TableName, pf.name AS PartitionFunctionName, c.name AS ColumnName, iif(pf.name is null, 'no', 'yes') AS HasPartition
FROM sys.tables AS t
LEFT JOIN sys.objects AS o ON t.object_id = o.object_id
LEFT JOIN sys.schemas AS s ON o.schema_id = s.schema_id
LEFT JOIN sys.indexes AS i ON t.object_id = i.object_id 
LEFT JOIN sys.index_columns AS ic ON ic.partition_ordinal > 0 AND ic.index_id = i.index_id AND ic.object_id = t.object_id 
LEFT JOIN sys.columns AS c ON c.object_id = ic.object_id AND c.column_id = ic.column_id 
LEFT JOIN sys.partition_schemes ps ON i.data_space_id = ps.data_space_id 
LEFT JOIN sys.partition_functions pf ON pf.function_id = ps.function_id 
WHERE s.name='[your schema]' AND t.name = '[your table name]'

If the table has physical partitions, you would see "HasPartition" as "yes" as follows.

[Screenshot: SQL query result showing HasPartition = yes]

Best practice for loading data into Azure SQL Database

When you copy data into Azure SQL Database, you might require different write behavior:

  • Append: My source data has only new records.
  • Upsert: My source data has both inserts and updates.
  • Overwrite: I want to reload an entire dimension table each time.
  • Write with custom logic: I need extra processing before the final insertion into the destination table.

Refer to the respective sections below for how to configure these behaviors in Azure Data Factory, along with best practices.

Append data

Appending data is the default behavior of this Azure SQL Database sink connector. Azure Data Factory does a bulk insert to write to your table efficiently. You can configure the source and sink accordingly in the copy activity.

Upsert data

Option 1: When you have a large amount of data to copy, you can bulk load all records into a staging table by using the copy activity, then run a stored procedure activity to apply a MERGE or INSERT/UPDATE statement in one shot.

Copy activity currently doesn't natively support loading data into a database temporary table. There is an advanced way to set it up with a combination of multiple activities; refer to Optimize Azure SQL Database Bulk Upsert scenarios. Below is a sample that uses a permanent table as staging.

As an example, in Azure Data Factory, you can create a pipeline with a Copy activity chained with a Stored Procedure activity. The former copies data from your source store into an Azure SQL Database staging table, for example, UpsertStagingTable, as the table name in the dataset. Then the latter invokes a stored procedure to merge source data from the staging table into the target table and clean up the staging table.

[Diagram: upsert by using a staging table and a stored procedure]

In your database, define a stored procedure with MERGE logic, like the following example, which is pointed to from the previous stored procedure activity. Assume that the target is the Marketing table with three columns: ProfileID, State, and Category. Do the upsert based on the ProfileID column.

CREATE PROCEDURE [dbo].[spMergeData]
AS
BEGIN
    MERGE TargetTable AS target
    USING UpsertStagingTable AS source
    ON (target.[ProfileID] = source.[ProfileID])
    WHEN MATCHED THEN
        UPDATE SET State = source.State
    WHEN NOT MATCHED THEN
        INSERT ([ProfileID], [State], [Category])
        VALUES (source.ProfileID, source.State, source.Category);
    TRUNCATE TABLE UpsertStagingTable
END

Option 2: You can choose to invoke a stored procedure within the copy activity. This approach runs each batch (as governed by the writeBatchSize property) in the source table instead of using bulk insert as the default approach in the copy activity.

Overwrite the entire table

You can configure the preCopyScript property in the copy activity sink. In this case, for each copy activity that runs, Azure Data Factory runs the script first. Then it runs the copy to insert the data. For example, to overwrite the entire table with the latest data, specify a script to first delete all the records before you bulk load the new data from the source, as in the sketch below.
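
A minimal sketch of such a sink (the target table name dbo.MyTargetTable is hypothetical; a DELETE statement would work here as well):

"sink": {
    "type": "AzureSqlSink",
    "preCopyScript": "TRUNCATE TABLE dbo.MyTargetTable",
    "writeBatchSize": 100000
}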

Write data with custom logic

The steps to write data with custom logic are similar to those described in the Upsert data section. When you need to apply extra processing before the final insertion of source data into the destination table, you can load to a staging table and then invoke a stored procedure activity, or invoke a stored procedure in the copy activity sink to apply data.

Invoke a stored procedure from a SQL sink

When you copy data into Azure SQL Database, you can also configure and invoke a user-specified stored procedure with additional parameters on each batch of the source table. The stored procedure feature takes advantage of table-valued parameters.

You can use a stored procedure when built-in copy mechanisms don't serve the purpose. An example is when you want to apply extra processing before the final insertion of source data into the destination table. Some extra processing examples are when you want to merge columns, look up additional values, and insert into more than one table.

The following sample shows how to use a stored procedure to do an upsert into a table in Azure SQL Database. Assume that the input data and the sink Marketing table each have three columns: ProfileID, State, and Category. Do the upsert based on the ProfileID column, and only apply it for a specific category called "ProductA".

  1. In your database, define the table type with the same name as sqlWriterTableType. The schema of the table type is the same as the schema returned by your input data.

    CREATE TYPE [dbo].[MarketingType] AS TABLE(
        [ProfileID] [varchar](256) NOT NULL,
        [State] [varchar](256) NOT NULL,
        [Category] [varchar](256) NOT NULL
    )
    
  2. In your database, define the stored procedure with the same name as sqlWriterStoredProcedureName. It handles input data from your specified source and merges it into the output table. The parameter name of the table type in the stored procedure matches the storedProcedureTableTypeParameterName defined in the copy activity sink.

    CREATE PROCEDURE spOverwriteMarketing @Marketing [dbo].[MarketingType] READONLY, @category varchar(256)
    AS
    BEGIN
    MERGE [dbo].[Marketing] AS target
    USING @Marketing AS source
    ON (target.ProfileID = source.ProfileID and target.Category = @category)
    WHEN MATCHED THEN
        UPDATE SET State = source.State
    WHEN NOT MATCHED THEN
        INSERT (ProfileID, State, Category)
        VALUES (source.ProfileID, source.State, source.Category);
    END
    
  3. In Azure Data Factory, define the SQL sink section in the copy activity as follows:

    "sink": {
        "type": "AzureSqlSink",
        "sqlWriterStoredProcedureName": "spOverwriteMarketing",
        "storedProcedureTableTypeParameterName": "Marketing",
        "sqlWriterTableType": "MarketingType",
        "storedProcedureParameters": {
            "category": {
                "value": "ProductA"
            }
        }
    }
    

Data type mapping for Azure SQL Database

When data is copied from or to Azure SQL Database, the following mappings are used from Azure SQL Database data types to Azure Data Factory interim data types. To learn how the copy activity maps the source schema and data type to the sink, see Schema and data type mappings.

Azure SQL Database data type | Azure Data Factory interim data type
bigint | Int64
binary | Byte[]
bit | Boolean
char | String, Char[]
date | DateTime
datetime | DateTime
datetime2 | DateTime
datetimeoffset | DateTimeOffset
decimal | Decimal
FILESTREAM attribute (varbinary(max)) | Byte[]
float | Double
image | Byte[]
int | Int32
money | Decimal
nchar | String, Char[]
ntext | String, Char[]
numeric | Decimal
nvarchar | String, Char[]
real | Single
rowversion | Byte[]
smalldatetime | DateTime
smallint | Int16
smallmoney | Decimal
sql_variant | Object
text | String, Char[]
time | TimeSpan
timestamp | Byte[]
tinyint | Byte
uniqueidentifier | Guid
varbinary | Byte[]
varchar | String, Char[]
xml | String

Note

For data types that map to the Decimal interim type, the Copy activity currently supports precision up to 28. If you have data with precision larger than 28, consider converting it to a string in the SQL query, as in the sketch below.
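
A minimal sketch of such a conversion in a source query (the column and table names here are hypothetical):

SELECT CAST(HighPrecisionColumn AS VARCHAR(42)) AS HighPrecisionColumn
FROM dbo.MySourceTable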

Lookup activity properties

To learn details about the properties, check Lookup activity.

GetMetadata activity properties

To learn details about the properties, check GetMetadata activity.

Using Always Encrypted

When you copy data from/to Azure SQL Database with Always Encrypted, use the generic ODBC connector and the SQL Server ODBC driver via a self-hosted integration runtime. This Azure SQL Database connector doesn't support Always Encrypted now.

More specifically:

  1. Set up a self-hosted integration runtime if you don't have one. See the Self-hosted integration runtime article for details.

  2. 此处下载适用于 SQL Server 的 64 位 ODBC 驱动程序,并将其安装在 Integration Runtime 计算机上。Download the 64-bit ODBC driver for SQL Server from here, and install on the Integration Runtime machine. 若要详细了解此驱动程序的工作原理,请参阅在适用于 SQL Server 的 ODBC 驱动程序中使用 Always EncryptedLearn more about how this driver works from Using Always Encrypted with the ODBC Driver for SQL Server.

  3. Create a linked service with the ODBC type to connect to your SQL database. Refer to the following samples:

    • To use SQL authentication: Specify the ODBC connection string as below, and select Basic authentication to set the user name and password.

      Driver={ODBC Driver 17 for SQL Server};Server=<serverName>;Database=<databaseName>;ColumnEncryption=Enabled;KeyStoreAuthentication=KeyVaultClientSecret;KeyStorePrincipalId=<servicePrincipalId>;KeyStoreSecret=<servicePrincipalKey>
      
    • To use Data Factory managed identity authentication:

      1. Follow the same prerequisites to create a database user for the managed identity and grant it the proper role in your database.
      2. In the linked service, specify the ODBC connection string as below, and select Anonymous authentication, because the connection string itself indicates Authentication=ActiveDirectoryMsi.
      Driver={ODBC Driver 17 for SQL Server};Server=<serverName>;Database=<databaseName>;ColumnEncryption=Enabled;KeyStoreAuthentication=KeyVaultClientSecret;KeyStorePrincipalId=<servicePrincipalId>;KeyStoreSecret=<servicePrincipalKey>;Authentication=ActiveDirectoryMsi;
      
  4. Create the dataset and copy activity with the ODBC type accordingly. Learn more from the ODBC connector article.

Next steps

For a list of data stores supported as sources and sinks by the copy activity in Azure Data Factory, see Supported data stores and formats.