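Use PowerShell to create a data factory pipeline that copies data from SQL Server to Azure
This sample PowerShell script creates a pipeline in Azure Data Factory that copies data from a SQL Server database to Azure Blob storage.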
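Note
We recommend that you use the Azure Az PowerShell module to interact with Azure. To get started, see Install Azure PowerShell. To learn how to migrate to the Az PowerShell module, see Migrate Azure PowerShell from AzureRM to Az.
This sample requires Azure PowerShell. Run Get-Module -ListAvailable Az to find the installed version.
If you need to install or upgrade, see Install Azure PowerShell module.
Run the Connect-AzAccount -Environment AzureChinaCloud cmdlet to connect to Azure operated by 21Vianet.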
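The version check described above can also be scripted. The following is a minimal sketch, assuming the Data Factory cmdlets used by this sample ship in the Az.DataFactory module; the variable names are illustrative only.

```powershell
# Sketch: look for the newest installed version of the Az.DataFactory module.
$module = Get-Module -ListAvailable -Name Az.DataFactory |
    Sort-Object Version -Descending |
    Select-Object -First 1

# $hasDataFactoryModule is an illustrative name, not part of the sample script.
$hasDataFactoryModule = $null -ne $module

if ($hasDataFactoryModule) {
    Write-Host "Az.DataFactory $($module.Version) is installed."
}
else {
    Write-Host "Az.DataFactory was not found. Install the Az module before you continue."
}
```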
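Prerequisites
- SQL Server. In this sample, you use a SQL Server database as the source data store.
- Azure Storage account. This sample uses Azure Blob storage as the destination (sink) data store. If you don't have an Azure storage account, see the Create a storage account article for steps to create one.
- Self-hosted integration runtime. Download the MSI file from the Download Center and run it to install a self-hosted integration runtime on your machine.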
Create a sample database in SQL Server
In your SQL Server database, use the following SQL script to create a table named emp:
CREATE TABLE dbo.emp
(
    ID int IDENTITY(1,1) NOT NULL,
    FirstName varchar(50),
    LastName varchar(50),
    CONSTRAINT PK_emp PRIMARY KEY (ID)
)
GO
Insert some sample data into the table:
INSERT INTO dbo.emp (FirstName, LastName) VALUES ('John', 'Doe')
INSERT INTO dbo.emp (FirstName, LastName) VALUES ('Jane', 'Doe')
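Sample script
Important
This script creates JSON files that define Data Factory entities (linked services, datasets, and pipeline) in the c:\ folder on your hard drive.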
$resourceGroupName = "<Resource group name>"
$dataFactoryName = "<Data factory name>" # must be globally unique
$storageAccountName = "<Azure Storage account name>"
$storageAccountKey = "<Azure Storage account key>"
$sqlServerName = "<SQL server name>"
$sqlDatabaseName = "<SQL Server database name>"
$sqlTableName = "emp" # create the emp table with ID, FirstName, and LastName columns if it does not already exist in your database
$sqlUserName = "<SQL Authentication - user name>"
$sqlPassword = "<SQL Authentication - user password>"
$blobFolderPath = "<Azure blob container name>/<Azure blob folder name>"
$integrationRuntimeName = "<Self-hosted integration runtime name>"
$pipelineName = "SqlServerToBlobPipeline"
$dataFactoryRegion = "China East 2"
# Create a resource group
New-AzResourceGroup -Name $resourceGroupName -Location $dataFactoryRegion
# create a data factory
$df = Set-AzDataFactoryV2 -ResourceGroupName $resourceGroupName -Name $dataFactoryName -Location $dataFactoryRegion
# create a self-hosted integration runtime
Set-AzDataFactoryV2IntegrationRuntime -Name $integrationRuntimeName -Type SelfHosted -DataFactoryName $dataFactoryName -ResourceGroupName $resourceGroupName
# get the authorization key from the created integration runtime in the cloud
Get-AzDataFactoryV2IntegrationRuntimeKey -Name $integrationRuntimeName -DataFactoryName $dataFactoryName -ResourceGroupName $resourceGroupName | ConvertTo-Json
# IMPORTANT: Install self-hosted integration runtime on your machine and use one of the keys to register the IR installed on your machine with the cloud service
# create an Azure Storage linked service
## JSON definition of the linked service.
$storageLinkedServiceDefinition = @"
{
"name": "AzureStorageLinkedService",
"properties": {
"type": "AzureStorage",
"typeProperties": {
"connectionString": {
"value": "DefaultEndpointsProtocol=https;AccountName=$storageAccountName;AccountKey=$storageAccountKey;EndpointSuffix=core.chinacloudapi.cn",
"type": "SecureString"
}
}
}
}
"@
## IMPORTANT: stores the JSON definition in a file that will be used by the Set-AzDataFactoryV2LinkedService command.
$storageLinkedServiceDefinition | Out-File c:\AzureStorageLinkedService.json
## Creates a linked service in the data factory
Set-AzDataFactoryV2LinkedService -DataFactoryName $dataFactoryName -ResourceGroupName $resourceGroupName -Name "AzureStorageLinkedService" -File c:\AzureStorageLinkedService.json
# create an on-premises SQL Server linked service
## JSON definition of the linked service.
$sqlServerLinkedServiceDefinition = @"
{
"properties": {
"type": "SqlServer",
"typeProperties": {
"connectionString": {
"type": "SecureString",
"value": "Server=$sqlServerName;Database=$sqlDatabaseName;User ID=$sqlUserName;Password=$sqlPassword;Timeout=60"
}
},
"connectVia": {
"type": "integrationRuntimeReference",
"referenceName": "$integrationRuntimeName"
}
},
"name": "SqlServerLinkedService"
}
"@
## IMPORTANT: stores the JSON definition in a file that will be used by the New-AzDataFactoryV2LinkedServiceEncryptedCredential command.
$sqlServerLinkedServiceDefinition | Out-File c:\SqlServerLinkedService.json
## Encrypt SQL Server credentials
New-AzDataFactoryV2LinkedServiceEncryptedCredential -DataFactoryName $dataFactoryName -ResourceGroupName $resourceGroupName -IntegrationRuntimeName $integrationRuntimeName -File "c:\SqlServerLinkedService.json" > c:\EncryptedSqlServerLinkedService.json
# Create a SQL Server linked service from the encrypted definition
Set-AzDataFactoryV2LinkedService -DataFactoryName $dataFactoryName -ResourceGroupName $resourceGroupName -Name "EncryptedSqlServerLinkedService" -File "c:\EncryptedSqlServerLinkedService.json"
# Create a source dataset for source SQL Server Database
## JSON definition of the dataset
$sourceSqlServerDatasetDefiniton = @"
{
"properties": {
"type": "SqlServerTable",
"typeProperties": {
"tableName": "$sqlTableName"
},
"structure": [
{
"name": "ID",
"type": "String"
},
{
"name": "FirstName",
"type": "String"
},
{
"name": "LastName",
"type": "String"
}
],
"linkedServiceName": {
"referenceName": "EncryptedSqlServerLinkedService",
"type": "LinkedServiceReference"
}
},
"name": "SqlServerDataset"
}
"@
## IMPORTANT: store the JSON definition in a file that will be used by the Set-AzDataFactoryV2Dataset command.
$sourceSqlServerDatasetDefiniton | Out-File c:\SqlServerDataset.json
# Create the SQL Server dataset in the data factory
Set-AzDataFactoryV2Dataset -DataFactoryName $dataFactoryName -ResourceGroupName $resourceGroupName -Name "SqlServerDataset" -File "c:\SqlServerDataset.json"
# Create a dataset for sink Azure Blob Storage
## JSON definition of the dataset
$sinkBlobDatasetDefiniton = @"
{
"properties": {
"type": "AzureBlob",
"typeProperties": {
"folderPath": "$blobFolderPath",
"format": {
"type": "TextFormat"
}
},
"linkedServiceName": {
"referenceName": "AzureStorageLinkedService",
"type": "LinkedServiceReference"
}
},
"name": "AzureBlobDataset"
}
"@
## IMPORTANT: store the JSON definition in a file that will be used by the Set-AzDataFactoryV2Dataset command.
$sinkBlobDatasetDefiniton | Out-File c:\AzureBlobDataset.json
## Create the Azure Blob dataset
Set-AzDataFactoryV2Dataset -DataFactoryName $dataFactoryName -ResourceGroupName $resourceGroupName -Name "AzureBlobDataset" -File "c:\AzureBlobDataset.json"
# Create a pipeline in the data factory
## JSON definition of the pipeline
$pipelineDefinition = @"
{
"name": "$pipelineName",
"properties": {
"activities": [
{
"type": "Copy",
"typeProperties": {
"source": {
"type": "SqlSource"
},
"sink": {
"type":"BlobSink"
}
},
"name": "CopySqlServerToAzureBlobActivity",
"inputs": [
{
"referenceName": "SqlServerDataset",
"type": "DatasetReference"
}
],
"outputs": [
{
"referenceName": "AzureBlobDataset",
"type": "DatasetReference"
}
]
}
]
}
}
"@
## IMPORTANT: store the JSON definition in a file that will be used by the Set-AzDataFactoryV2Pipeline command.
$pipelineDefinition | Out-File c:\SqlServerToBlobPipeline.json
## Create a pipeline in the data factory
Set-AzDataFactoryV2Pipeline -DataFactoryName $dataFactoryName -ResourceGroupName $resourceGroupName -Name "$pipelineName" -File "c:\SqlServerToBlobPipeline.json"
# start the pipeline run
$runId = Invoke-AzDataFactoryV2Pipeline -DataFactoryName $dataFactoryName -ResourceGroupName $resourceGroupName -PipelineName $pipelineName
# Check the pipeline run status until it finishes the copy operation
while ($True) {
    $result = Get-AzDataFactoryV2ActivityRun -DataFactoryName $dataFactoryName -ResourceGroupName $resourceGroupName -PipelineRunId $runId -RunStartedAfter (Get-Date).AddMinutes(-30) -RunStartedBefore (Get-Date).AddMinutes(30)
    if (($result | Where-Object { $_.Status -eq "InProgress" } | Measure-Object).count -ne 0) {
        Write-Host "Pipeline run status: In Progress" -foregroundcolor "Yellow"
        Start-Sleep -Seconds 30
    }
    else {
        Write-Host "Pipeline $pipelineName run finished. Result:" -foregroundcolor "Yellow"
        $result
        break
    }
}
# Get the activity run details
$result = Get-AzDataFactoryV2ActivityRun -DataFactoryName $dataFactoryName -ResourceGroupName $resourceGroupName `
    -PipelineRunId $runId `
    -RunStartedAfter (Get-Date).AddMinutes(-10) `
    -RunStartedBefore (Get-Date).AddMinutes(10) `
    -ErrorAction Stop
$result
if ($result.Status -eq "Succeeded") {
    $result.Output -join "`r`n"
}
else {
    $result.Error -join "`r`n"
}
# To remove the data factory from the resource group
# Remove-AzDataFactoryV2 -Name $dataFactoryName -ResourceGroupName $resourceGroupName
#
# To remove the whole resource group
# Remove-AzResourceGroup -Name $resourceGroupName
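The sample script writes each JSON definition to the c:\ folder without checking it first. As a minimal sketch, assuming you want to catch malformed JSON before calling the Data Factory cmdlets, the following hypothetical helper Save-JsonDefinition (not part of the original sample) validates the text with ConvertFrom-Json and writes it to the temp folder instead of c:\. The account name and key are placeholder values.

```powershell
# Hypothetical helper (not part of the original sample): validates a JSON
# definition, writes it to a file, and returns the file path.
function Save-JsonDefinition {
    param(
        [Parameter(Mandatory)] [string] $Json,
        [Parameter(Mandatory)] [string] $Path
    )

    # ConvertFrom-Json raises an error if the text is not valid JSON.
    $null = $Json | ConvertFrom-Json -ErrorAction Stop

    # Write the definition as UTF-8 text.
    Set-Content -Path $Path -Value $Json -Encoding UTF8
    return $Path
}

# Placeholder values for illustration only.
$storageAccountName = "examplestorage"
$storageAccountKey  = "example-key"

$definition = @"
{
    "name": "AzureStorageLinkedService",
    "properties": {
        "type": "AzureStorage",
        "typeProperties": {
            "connectionString": {
                "value": "DefaultEndpointsProtocol=https;AccountName=$storageAccountName;AccountKey=$storageAccountKey;EndpointSuffix=core.chinacloudapi.cn",
                "type": "SecureString"
            }
        }
    }
}
"@

# Save to the temp folder, then read the file back to confirm it parses.
$outFile = Join-Path ([System.IO.Path]::GetTempPath()) "AzureStorageLinkedService.json"
$savedPath = Save-JsonDefinition -Json $definition -Path $outFile
$parsed = Get-Content -Path $savedPath -Raw | ConvertFrom-Json
Write-Host "Saved linked service '$($parsed.name)' to $savedPath"
```

The same helper could wrap the dataset and pipeline definitions; whether the Data Factory cmdlets accept a particular file encoding is something to confirm in your own environment.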
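Clean up deployment
After you run the sample script, you can use the following command to remove the resource group and all resources associated with it: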
Remove-AzResourceGroup -ResourceGroupName $resourceGroupName
To remove the data factory from the resource group, run the following command:
Remove-AzDataFactoryV2 -Name $dataFactoryName -ResourceGroupName $resourceGroupName
Script explanation
This script uses the following commands:
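Command | Notes |
---|---|
New-AzResourceGroup | Creates a resource group in which all resources are stored. |
Set-AzDataFactoryV2 | Creates a data factory. |
New-AzDataFactoryV2LinkedServiceEncryptedCredential | Encrypts the credentials in a linked service and generates a new linked service definition with the encrypted credentials. |
Set-AzDataFactoryV2LinkedService | Creates a linked service in the data factory. A linked service links a data store or compute to a data factory. |
Set-AzDataFactoryV2Dataset | Creates a dataset in the data factory. A dataset represents the input or output of an activity in a pipeline. |
Set-AzDataFactoryV2Pipeline | Creates a pipeline in the data factory. A pipeline contains one or more activities that each perform an operation. In this pipeline, a copy activity copies data from the SQL Server database to Azure Blob storage. |
Invoke-AzDataFactoryV2Pipeline | Creates a run for the pipeline. In other words, runs the pipeline. |
Get-AzDataFactoryV2ActivityRun | Gets details about the run of an activity (activity run) in the pipeline. |
Remove-AzResourceGroup | Deletes a resource group, including all nested resources. |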
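The sample script polls the activity run status in a loop that has no upper time limit. The following is a minimal sketch of the same polling pattern with a timeout. Wait-ForRunCompletion is a hypothetical helper, not part of the sample, and it takes a script block as the status source so that it can be tried without an Azure connection. The stand-in status source below is for illustration only.

```powershell
# Hypothetical helper (not part of the original sample): polls a status
# script block until it stops returning "InProgress" or a timeout elapses.
function Wait-ForRunCompletion {
    param(
        [Parameter(Mandatory)] [scriptblock] $GetStatus,
        [int] $TimeoutSeconds = 1800,
        [int] $PollSeconds = 30
    )

    $deadline = (Get-Date).AddSeconds($TimeoutSeconds)
    while ($true) {
        $status = & $GetStatus
        if ($status -ne "InProgress") {
            return $status
        }
        if ((Get-Date) -ge $deadline) {
            throw "Timed out after $TimeoutSeconds seconds while the run was still in progress."
        }
        Start-Sleep -Seconds $PollSeconds
    }
}

# Stand-in status source: reports "InProgress" twice, then "Succeeded".
$script:pollCount = 0
$fakeStatus = {
    $script:pollCount++
    if ($script:pollCount -lt 3) { "InProgress" } else { "Succeeded" }
}

$finalStatus = Wait-ForRunCompletion -GetStatus $fakeStatus -TimeoutSeconds 60 -PollSeconds 1
Write-Host "Final status: $finalStatus"
```

Against a real data factory, the script block would need to call Get-AzDataFactoryV2ActivityRun with the parameters used in the sample script and reduce the returned activity runs to a single status value.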
Related content
For more information about Azure PowerShell, see the Azure PowerShell documentation.
You can find additional Azure Data Factory PowerShell script samples in Azure Data Factory PowerShell samples.