Copy multiple tables in bulk by using Azure Data Factory and PowerShell

APPLIES TO: ✔️ Azure Data Factory ✖️ Azure Synapse Analytics (Preview)

This tutorial demonstrates copying a number of tables from Azure SQL Database to Azure SQL Data Warehouse. You can apply the same pattern in other copy scenarios as well, such as copying tables from SQL Server or Oracle to Azure SQL Database, SQL Data Warehouse, or Azure Blob storage, or copying different paths from a blob to Azure SQL Database tables.

At a high level, this tutorial involves the following steps:

  • Create a data factory.
  • Create Azure SQL Database, Azure SQL Data Warehouse, and Azure Storage linked services.
  • Create Azure SQL Database and Azure SQL Data Warehouse datasets.
  • Create a pipeline to look up the tables to be copied and another pipeline to perform the actual copy operation.
  • Start a pipeline run.
  • Monitor the pipeline and activity runs.

This tutorial uses Azure PowerShell. To learn about using other tools/SDKs to create a data factory, see Quickstarts.

End-to-end workflow

In this scenario, we have a number of tables in Azure SQL Database that we want to copy to SQL Data Warehouse. Here is the logical sequence of steps in the workflow, which runs in pipelines:

(Workflow diagram)

  • The first pipeline looks up the list of tables that need to be copied over to the sink data stores. Alternatively, you can maintain a metadata table that lists all the tables to be copied to the sink data store (a sketch of such a control table follows this list). The pipeline then triggers another pipeline, which iterates over each table in the database and performs the data copy operation.
  • The second pipeline performs the actual copy. It takes the list of tables as a parameter. For each table in the list, the pipeline copies the specific table in Azure SQL Database to the corresponding table in SQL Data Warehouse by using staged copy via Blob storage and PolyBase for best performance. In this example, the first pipeline passes the list of tables as the value for the parameter.
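
If you choose the metadata-table approach, the Lookup activity can read the list from a control table instead of the system catalog. Here is a minimal sketch of such a table; the name dbo.CopyTableList is hypothetical, and Invoke-Sqlcmd assumes the SqlServer PowerShell module is installed:

    # Hypothetical control table that lists the tables to copy.
    # The tutorial itself queries information_schema.TABLES instead.
    $createListQuery = "
    CREATE TABLE dbo.CopyTableList (TABLE_SCHEMA sysname, TABLE_NAME sysname);
    INSERT INTO dbo.CopyTableList (TABLE_SCHEMA, TABLE_NAME)
    VALUES ('SalesLT', 'Customer'), ('SalesLT', 'Product');"
    Invoke-Sqlcmd -ServerInstance "<servername>.database.chinacloudapi.cn" `
        -Database "<databasename>" -Username "<username>" -Password "<password>" `
        -Query $createListQuery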

If you don't have an Azure subscription, create a 1 RMB trial account before you begin.

Prerequisites

Note

This article has been updated to use the new Azure PowerShell Az module. You can still use the AzureRM module, which will continue to receive bug fixes until at least December 2020. To learn more about the new Az module and AzureRM compatibility, see Introducing the new Azure PowerShell Az module. For Az module installation instructions, see Install Azure PowerShell.
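
For reference, a typical installation of the Az module from the PowerShell Gallery looks like this (one common approach; see the installation article for full instructions):

    Install-Module -Name Az -Repository PSGallery -Scope CurrentUser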

  • Azure PowerShell. Follow the instructions in How to install and configure Azure PowerShell.
  • Azure Storage account. The Azure Storage account is used as staging blob storage in the bulk copy operation.
  • Azure SQL Database. This database contains the source data.
  • Azure SQL Data Warehouse. This data warehouse holds the data copied over from the SQL Database.

Prepare SQL Database and SQL Data Warehouse

Prepare the source Azure SQL Database:

Create a database with the Adventure Works LT sample data in SQL Database by following the Create a database in Azure SQL Database article. This tutorial copies all the tables from this sample database to a SQL data warehouse.

Prepare the sink Azure SQL Data Warehouse:

  1. If you don't have an Azure SQL Data Warehouse, see the Create a SQL Data Warehouse article for steps to create one.

  2. Create corresponding table schemas in SQL Data Warehouse (one example is sketched below). You use Azure Data Factory to migrate/copy data in a later step.
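
    For example, here is a minimal sketch of one matching table in the data warehouse. The column list is illustrative and must be adapted to each source table, and the <dwservername> placeholders are assumptions that follow the pattern used elsewhere in this tutorial:

    # Illustrative schema only; adjust columns to match each source table.
    # ROUND_ROBIN is a simple default distribution.
    $createDwTable = "
    CREATE TABLE [SalesLT].[Customer] (
        CustomerID int NOT NULL,
        FirstName  nvarchar(50) NOT NULL,
        LastName   nvarchar(50) NOT NULL
    ) WITH (DISTRIBUTION = ROUND_ROBIN);"
    Invoke-Sqlcmd -ServerInstance "<dwservername>.database.chinacloudapi.cn" `
        -Database "<dwdatabasename>" -Username "<username>" -Password "<password>" `
        -Query $createDwTable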

Azure services to access SQL server

For both SQL Database and SQL Data Warehouse, allow Azure services to access the SQL server. Ensure that the Allow access to Azure services setting is turned ON for your server. This setting allows the Data Factory service to read data from your Azure SQL Database and write data to your Azure SQL Data Warehouse. To verify and turn on this setting, follow these steps (a PowerShell alternative is sketched after the list):

  1. Click All services on the left, and then click SQL servers.
  2. Select your server, and click Firewall under SETTINGS.
  3. In the Firewall settings page, click ON for Allow access to Azure services.
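
The same setting can also be turned on from PowerShell. This sketch assumes the Az.Sql module and your own resource group and server names:

    # Creates the special firewall rule that allows Azure services,
    # which is equivalent to switching the setting to ON in the portal.
    New-AzSqlServerFirewallRule -ResourceGroupName "<your resource group>" `
        -ServerName "<your SQL server name>" -AllowAllAzureIPs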

Create a data factory

  1. Launch PowerShell. Keep Azure PowerShell open until the end of this tutorial. If you close and reopen it, you need to run these commands again.

    Run the following command, and enter the user name and password that you use to sign in to the Azure portal:

    Connect-AzAccount -Environment AzureChinaCloud
    

    Run the following command to view all the subscriptions for this account:

    Get-AzSubscription
    

    Run the following command to select the subscription that you want to work with. Replace SubscriptionId with the ID of your Azure subscription:

    Select-AzSubscription -SubscriptionId "<SubscriptionId>"
    
  2. Run the Set-AzDataFactoryV2 cmdlet to create a data factory. Replace the placeholders with your own values before executing the command. (A quick verification command is sketched after the notes below.)

    $resourceGroupName = "<your resource group to create the factory>"
    $dataFactoryName = "<specify the name of data factory to create. It must be globally unique.>"
    Set-AzDataFactoryV2 -ResourceGroupName $resourceGroupName -Location "China East 2" -Name $dataFactoryName
    

    Note the following points:

    • The name of the Azure data factory must be globally unique. If you receive the following error, change the name and try again.

      The specified Data Factory name 'ADFv2QuickStartDataFactory' is already in use. Data Factory names must be globally unique.
      
    • To create Data Factory instances, you must be a Contributor or Administrator of the Azure subscription.

    • For a list of Azure regions in which Data Factory is currently available, select the regions that interest you on the following page, and then expand Analytics to locate Data Factory: Products available by region. The data stores (Azure Storage, Azure SQL Database, and so on) and compute resources (HDInsight and so on) used by the data factory can be in other regions.
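
    After the cmdlet completes, you can confirm that the factory exists by querying it back (a quick sanity check, not part of the original steps):

      Get-AzDataFactoryV2 -ResourceGroupName $resourceGroupName -Name $dataFactoryName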

Create linked services

In this tutorial, you create three linked services for the source, the sink, and the staging blob, respectively, which include the connections to your data stores:

Create the source Azure SQL Database linked service

  1. Create a JSON file named AzureSqlDatabaseLinkedService.json in the C:\ADFv2TutorialBulkCopy folder with the following content. (Create the folder ADFv2TutorialBulkCopy if it does not already exist.)

    Important

    Replace <servername>, <databasename>, <username>@<servername>, and <password> with the values for your Azure SQL Database before saving the file.

    {
        "name": "AzureSqlDatabaseLinkedService",
        "properties": {
            "type": "AzureSqlDatabase",
            "typeProperties": {
                "connectionString": "Server=tcp:<servername>.database.chinacloudapi.cn,1433;Database=<databasename>;User ID=<username>@<servername>;Password=<password>;Trusted_Connection=False;Encrypt=True;Connection Timeout=30"
            }
        }
    }
    
  2. In Azure PowerShell, switch to the ADFv2TutorialBulkCopy folder.

  3. Run the Set-AzDataFactoryV2LinkedService cmdlet to create the linked service AzureSqlDatabaseLinkedService.

    Set-AzDataFactoryV2LinkedService -DataFactoryName $dataFactoryName -ResourceGroupName $resourceGroupName -Name "AzureSqlDatabaseLinkedService" -File ".\AzureSqlDatabaseLinkedService.json"
    

    Here is the sample output:

    LinkedServiceName : AzureSqlDatabaseLinkedService
    ResourceGroupName : <resourceGroupName>
    DataFactoryName   : <dataFactoryName>
    Properties        : Microsoft.Azure.Management.DataFactory.Models.AzureSqlDatabaseLinkedService
    

Create the sink Azure SQL Data Warehouse linked service

  1. Create a JSON file named AzureSqlDWLinkedService.json in the C:\ADFv2TutorialBulkCopy folder, with the following content:

    Important

    Replace <servername>, <databasename>, <username>@<servername>, and <password> with the values for your Azure SQL Data Warehouse before saving the file.

    {
        "name": "AzureSqlDWLinkedService",
        "properties": {
            "type": "AzureSqlDW",
            "typeProperties": {
                "connectionString": "Server=tcp:<servername>.database.chinacloudapi.cn,1433;Database=<databasename>;User ID=<username>@<servername>;Password=<password>;Trusted_Connection=False;Encrypt=True;Connection Timeout=30"
            }
        }
    }
    
  2. To create the linked service AzureSqlDWLinkedService, run the Set-AzDataFactoryV2LinkedService cmdlet.

    Set-AzDataFactoryV2LinkedService -DataFactoryName $dataFactoryName -ResourceGroupName $resourceGroupName -Name "AzureSqlDWLinkedService" -File ".\AzureSqlDWLinkedService.json"
    

    Here is the sample output:

    LinkedServiceName : AzureSqlDWLinkedService
    ResourceGroupName : <resourceGroupName>
    DataFactoryName   : <dataFactoryName>
    Properties        : Microsoft.Azure.Management.DataFactory.Models.AzureSqlDWLinkedService
    

Create the staging Azure Storage linked service

In this tutorial, you use Azure Blob storage as an interim staging area to enable PolyBase for better copy performance.
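
If you need the account key for the connection string in the next step, one way to retrieve it is with the Az.Storage cmdlets (assuming your own resource group and account names):

    $storageKeys = Get-AzStorageAccountKey -ResourceGroupName "<your resource group>" -Name "<accountName>"
    $storageKeys[0].Value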

  1. Create a JSON file named AzureStorageLinkedService.json in the C:\ADFv2TutorialBulkCopy folder, with the following content:

    Important

    Replace <accountName> and <accountKey> with the name and key of your Azure storage account before saving the file.

    {
        "name": "AzureStorageLinkedService",
        "properties": {
            "type": "AzureStorage",
            "typeProperties": {
                "connectionString": "DefaultEndpointsProtocol=https;AccountName=<accountName>;AccountKey=<accountKey>;EndpointSuffix=core.chinacloudapi.cn"
            }
        }
    }
    
  2. To create the linked service AzureStorageLinkedService, run the Set-AzDataFactoryV2LinkedService cmdlet.

    Set-AzDataFactoryV2LinkedService -DataFactoryName $dataFactoryName -ResourceGroupName $resourceGroupName -Name "AzureStorageLinkedService" -File ".\AzureStorageLinkedService.json"
    

    Here is the sample output:

    LinkedServiceName : AzureStorageLinkedService
    ResourceGroupName : <resourceGroupName>
    DataFactoryName   : <dataFactoryName>
    Properties        : Microsoft.Azure.Management.DataFactory.Models.AzureStorageLinkedService
    

Create datasets

In this tutorial, you create source and sink datasets, which specify the location where the data is stored:

Create a dataset for the source SQL Database

  1. Create a JSON file named AzureSqlDatabaseDataset.json in the C:\ADFv2TutorialBulkCopy folder, with the following content. The "tableName" is a dummy value, because later you use a SQL query in the copy activity to retrieve the data.

    {
        "name": "AzureSqlDatabaseDataset",
        "properties": {
            "type": "AzureSqlTable",
            "linkedServiceName": {
                "referenceName": "AzureSqlDatabaseLinkedService",
                "type": "LinkedServiceReference"
            },
            "typeProperties": {
                "tableName": "dummy"
            }
        }
    }
    
  2. To create the dataset AzureSqlDatabaseDataset, run the Set-AzDataFactoryV2Dataset cmdlet.

    Set-AzDataFactoryV2Dataset -DataFactoryName $dataFactoryName -ResourceGroupName $resourceGroupName -Name "AzureSqlDatabaseDataset" -File ".\AzureSqlDatabaseDataset.json"
    

    Here is the sample output:

    DatasetName       : AzureSqlDatabaseDataset
    ResourceGroupName : <resourceGroupname>
    DataFactoryName   : <dataFactoryName>
    Structure         :
    Properties        : Microsoft.Azure.Management.DataFactory.Models.AzureSqlTableDataset
    

Create a dataset for the sink SQL Data Warehouse

  1. Create a JSON file named AzureSqlDWDataset.json in the C:\ADFv2TutorialBulkCopy folder, with the following content. The "tableName" is set as a parameter; later, the copy activity that references this dataset passes the actual value into the dataset.

    {
        "name": "AzureSqlDWDataset",
        "properties": {
            "type": "AzureSqlDWTable",
            "linkedServiceName": {
                "referenceName": "AzureSqlDWLinkedService",
                "type": "LinkedServiceReference"
            },
            "typeProperties": {
                "tableName": {
                    "value": "@{dataset().DWTableName}",
                    "type": "Expression"
                }
            },
            "parameters":{
                "DWTableName":{
                    "type":"String"
                }
            }
        }
    }
    
  2. To create the dataset AzureSqlDWDataset, run the Set-AzDataFactoryV2Dataset cmdlet.

    Set-AzDataFactoryV2Dataset -DataFactoryName $dataFactoryName -ResourceGroupName $resourceGroupName -Name "AzureSqlDWDataset" -File ".\AzureSqlDWDataset.json"
    

    Here is the sample output:

    DatasetName       : AzureSqlDWDataset
    ResourceGroupName : <resourceGroupname>
    DataFactoryName   : <dataFactoryName>
    Structure         :
    Properties        : Microsoft.Azure.Management.DataFactory.Models.AzureSqlDwTableDataset
    

Create pipelines

In this tutorial, you create two pipelines:

Create the pipeline "IterateAndCopySQLTables"

This pipeline takes a list of tables as a parameter. For each table in the list, it copies data from the table in Azure SQL Database to Azure SQL Data Warehouse by using staged copy and PolyBase.

  1. Create a JSON file named IterateAndCopySQLTables.json in the C:\ADFv2TutorialBulkCopy folder, with the following content:

    {
        "name": "IterateAndCopySQLTables",
        "properties": {
            "activities": [
                {
                    "name": "IterateSQLTables",
                    "type": "ForEach",
                    "typeProperties": {
                        "isSequential": "false",
                        "items": {
                            "value": "@pipeline().parameters.tableList",
                            "type": "Expression"
                        },
                        "activities": [
                            {
                                "name": "CopyData",
                                "description": "Copy data from Azure SQL Database to SQL DW",
                                "type": "Copy",
                                "inputs": [
                                    {
                                        "referenceName": "AzureSqlDatabaseDataset",
                                        "type": "DatasetReference"
                                    }
                                ],
                                "outputs": [
                                    {
                                        "referenceName": "AzureSqlDWDataset",
                                        "type": "DatasetReference",
                                        "parameters": {
                                            "DWTableName": "[@{item().TABLE_SCHEMA}].[@{item().TABLE_NAME}]"
                                        }
                                    }
                                ],
                                "typeProperties": {
                                    "source": {
                                        "type": "SqlSource",
                                        "sqlReaderQuery": "SELECT * FROM [@{item().TABLE_SCHEMA}].[@{item().TABLE_NAME}]"
                                    },
                                    "sink": {
                                        "type": "SqlDWSink",
                                        "preCopyScript": "TRUNCATE TABLE [@{item().TABLE_SCHEMA}].[@{item().TABLE_NAME}]",
                                        "allowPolyBase": true
                                    },
                                    "enableStaging": true,
                                    "stagingSettings": {
                                        "linkedServiceName": {
                                            "referenceName": "AzureStorageLinkedService",
                                            "type": "LinkedServiceReference"
                                        }
                                    }
                                }
                            }
                        ]
                    }
                }
            ],
            "parameters": {
                "tableList": {
                    "type": "Object"
                }
            }
        }
    }
    
  2. To create the pipeline IterateAndCopySQLTables, run the Set-AzDataFactoryV2Pipeline cmdlet.

    Set-AzDataFactoryV2Pipeline -DataFactoryName $dataFactoryName -ResourceGroupName $resourceGroupName -Name "IterateAndCopySQLTables" -File ".\IterateAndCopySQLTables.json"
    

    Here is the sample output:

    PipelineName      : IterateAndCopySQLTables
    ResourceGroupName : <resourceGroupName>
    DataFactoryName   : <dataFactoryName>
    Activities        : {IterateSQLTables}
    Parameters        : {[tableList, Microsoft.Azure.Management.DataFactory.Models.ParameterSpecification]}
    

Create the pipeline "GetTableListAndTriggerCopyData"

This pipeline performs two steps:

  • Looks up the Azure SQL Database system table to get the list of tables to be copied.
  • Triggers the pipeline "IterateAndCopySQLTables" to do the actual data copy.
  1. Create a JSON file named GetTableListAndTriggerCopyData.json in the C:\ADFv2TutorialBulkCopy folder, with the following content:

    {
        "name":"GetTableListAndTriggerCopyData",
        "properties":{
            "activities":[
                { 
                    "name": "LookupTableList",
                    "description": "Retrieve the table list from Azure SQL dataabse",
                    "type": "Lookup",
                    "typeProperties": {
                        "source": {
                            "type": "SqlSource",
                            "sqlReaderQuery": "SELECT TABLE_SCHEMA, TABLE_NAME FROM information_schema.TABLES WHERE TABLE_TYPE = 'BASE TABLE' and TABLE_SCHEMA = 'SalesLT' and TABLE_NAME <> 'ProductModel'"
                        },
                        "dataset": {
                            "referenceName": "AzureSqlDatabaseDataset",
                            "type": "DatasetReference"
                        },
                        "firstRowOnly": false
                    }
                },
                {
                    "name": "TriggerCopy",
                    "type": "ExecutePipeline",
                    "typeProperties": {
                        "parameters": {
                            "tableList": {
                                "value": "@activity('LookupTableList').output.value",
                                "type": "Expression"
                            }
                        },
                        "pipeline": {
                            "referenceName": "IterateAndCopySQLTables",
                            "type": "PipelineReference"
                        },
                        "waitOnCompletion": true
                    },
                    "dependsOn": [
                        {
                            "activity": "LookupTableList",
                            "dependencyConditions": [
                                "Succeeded"
                            ]
                        }
                    ]
                }
            ]
        }
    }
    
  2. To create the pipeline GetTableListAndTriggerCopyData, run the Set-AzDataFactoryV2Pipeline cmdlet.

    Set-AzDataFactoryV2Pipeline -DataFactoryName $dataFactoryName -ResourceGroupName $resourceGroupName -Name "GetTableListAndTriggerCopyData" -File ".\GetTableListAndTriggerCopyData.json"
    

    Here is the sample output:

    PipelineName      : GetTableListAndTriggerCopyData
    ResourceGroupName : <resourceGroupName>
    DataFactoryName   : <dataFactoryName>
    Activities        : {LookupTableList, TriggerCopy}
    Parameters        :
    

Start and monitor the pipeline run

  1. Start a pipeline run for the main pipeline "GetTableListAndTriggerCopyData", and capture the pipeline run ID for future monitoring. This run in turn triggers the pipeline "IterateAndCopySQLTables", as specified in the ExecutePipeline activity.

    $runId = Invoke-AzDataFactoryV2Pipeline -DataFactoryName $dataFactoryName -ResourceGroupName $resourceGroupName -PipelineName 'GetTableListAndTriggerCopyData'
    
  2. Run the following script to continuously check the run status of the pipeline GetTableListAndTriggerCopyData, and print out the final pipeline run and activity run results.

    while ($True) {
        $run = Get-AzDataFactoryV2PipelineRun -ResourceGroupName $resourceGroupName -DataFactoryName $DataFactoryName -PipelineRunId $runId
    
        if ($run) {
            if ($run.Status -ne 'InProgress') {
                Write-Host "Pipeline run finished. The status is: " $run.Status -foregroundcolor "Yellow"
                Write-Host "Pipeline run details:" -foregroundcolor "Yellow"
                $run
                break
            }
            Write-Host  "Pipeline is running...status: InProgress" -foregroundcolor "Yellow"
        }
    
        Start-Sleep -Seconds 15
    }
    
    $result = Get-AzDataFactoryV2ActivityRun -DataFactoryName $dataFactoryName -ResourceGroupName $resourceGroupName -PipelineRunId $runId -RunStartedAfter (Get-Date).AddMinutes(-30) -RunStartedBefore (Get-Date).AddMinutes(30)
    Write-Host "Activity run details:" -foregroundcolor "Yellow"
    $result
    

    Here is the output of the sample run:

    Pipeline run details:
    ResourceGroupName : <resourceGroupName>
    DataFactoryName   : <dataFactoryName>
    RunId             : 0000000000-00000-0000-0000-000000000000
    PipelineName      : GetTableListAndTriggerCopyData
    LastUpdated       : 9/18/2017 4:08:15 PM
    Parameters        : {}
    RunStart          : 9/18/2017 4:06:44 PM
    RunEnd            : 9/18/2017 4:08:15 PM
    DurationInMs      : 90637
    Status            : Succeeded
    Message           : 
    
    Activity run details:
    ResourceGroupName : <resourceGroupName>
    DataFactoryName   : <dataFactoryName>
    ActivityName      : LookupTableList
    PipelineRunId     : 0000000000-00000-0000-0000-000000000000
    PipelineName      : GetTableListAndTriggerCopyData
    Input             : {source, dataset, firstRowOnly}
    Output            : {count, value, effectiveIntegrationRuntime}
    LinkedServiceName : 
    ActivityRunStart  : 9/18/2017 4:06:46 PM
    ActivityRunEnd    : 9/18/2017 4:07:09 PM
    DurationInMs      : 22995
    Status            : Succeeded
    Error             : {errorCode, message, failureType, target}
    
    ResourceGroupName : <resourceGroupName>
    DataFactoryName   : <dataFactoryName>
    ActivityName      : TriggerCopy
    PipelineRunId     : 0000000000-00000-0000-0000-000000000000
    PipelineName      : GetTableListAndTriggerCopyData
    Input             : {pipeline, parameters, waitOnCompletion}
    Output            : {pipelineRunId}
    LinkedServiceName : 
    ActivityRunStart  : 9/18/2017 4:07:11 PM
    ActivityRunEnd    : 9/18/2017 4:08:14 PM
    DurationInMs      : 62581
    Status            : Succeeded
    Error             : {errorCode, message, failureType, target}
    
  3. You can get the run ID of the pipeline "IterateAndCopySQLTables" and check the detailed activity run result, as follows.

    Write-Host "Pipeline 'IterateAndCopySQLTables' run result:" -foregroundcolor "Yellow"
    ($result | Where-Object {$_.ActivityName -eq "TriggerCopy"}).Output.ToString()
    

    Here is the output of the sample run:

    {
        "pipelineRunId": "7514d165-14bf-41fb-b5fb-789bea6c9e58"
    }
    
    $result2 = Get-AzDataFactoryV2ActivityRun -DataFactoryName $dataFactoryName -ResourceGroupName $resourceGroupName -PipelineRunId <copy above run ID> -RunStartedAfter (Get-Date).AddMinutes(-30) -RunStartedBefore (Get-Date).AddMinutes(30)
    $result2
    
  4. Connect to your sink Azure SQL Data Warehouse and confirm that the data has been copied from Azure SQL Database properly (a quick row-count check is sketched below).
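
    For example, here is a quick row-count check of one copied table. It again assumes the SqlServer module's Invoke-Sqlcmd and your data warehouse connection values:

    Invoke-Sqlcmd -ServerInstance "<dwservername>.database.chinacloudapi.cn" `
        -Database "<dwdatabasename>" -Username "<username>" -Password "<password>" `
        -Query "SELECT COUNT(*) AS CustomerRows FROM [SalesLT].[Customer];"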

Next steps

You performed the following steps in this tutorial:

  • Create a data factory.
  • Create Azure SQL Database, Azure SQL Data Warehouse, and Azure Storage linked services.
  • Create Azure SQL Database and Azure SQL Data Warehouse datasets.
  • Create a pipeline to look up the tables to be copied and another pipeline to perform the actual copy operation.
  • Start a pipeline run.
  • Monitor the pipeline and activity runs.

Advance to the following tutorial to learn how to copy data incrementally from a source to a destination: