将数据移到 Azure 虚拟机上的 SQL ServerMove data to SQL Server on an Azure virtual machine

本文概述了用于将数据从平面文件(CSV 或 TSV 格式)或从本地 SQL Server 移动到 Azure 虚拟机上的 SQL Server 的选项。This article outlines the options for moving data either from flat files (CSV or TSV formats) or from an on-premises SQL Server to SQL Server on an Azure virtual machine. 将数据移到云中的这些任务是“团队数据科学流程”的一部分。These tasks for moving data to the cloud are part of the Team Data Science Process.

有关概述将数据移到 Azure SQL 数据库进行机器学习的选项的主题,请参阅将数据移到 Azure SQL 数据库进行 Azure 机器学习For a topic that outlines the options for moving data to an Azure SQL Database for Machine Learning, see Move data to an Azure SQL Database for Azure Machine Learning.

下表汇总了用于将数据移到 Azure 虚拟机上的 SQL Server 的选项。The following table summarizes the options for moving data to SQL Server on an Azure virtual machine.

SOURCE 目标:Azure VM 上的 SQL ServerDESTINATION: SQL Server on Azure VM
平面文件Flat File 1.命令行大容量复制实用程序 (BCP)1. Command-line bulk copy utility (BCP)
2.批量插入 SQL 查询2. Bulk Insert SQL Query
3.SQL Server 中的图形内置实用程序3. Graphical Built-in Utilities in SQL Server
本地 SQL ServerOn-Premises SQL Server 1.将 SQL Server 数据库部署到 Microsoft Azure 虚拟机向导1. Deploy a SQL Server Database to a Microsoft Azure VM wizard
2.导出到平面文件2. Export to a flat File
3.SQL 数据库迁移向导3. SQL Database Migration Wizard
4.数据库备份和还原4. Database back up and restore

本文档假设从 SQL Server Management Studio 或 Visual Studio 数据库资源管理器执行 SQL 命令。This document assumes that SQL commands are executed from SQL Server Management Studio or Visual Studio Database Explorer.

提示

也可以使用 Azure 数据工厂来创建和安排会将数据移动到 Azure 上的 SQL Server 虚拟机的管道。As an alternative, you can use Azure Data Factory to create and schedule a pipeline that will move data to a SQL Server VM on Azure. 有关更多信息,请参阅使用 Azure 数据工厂复制数据(复制活动)For more information, see Copy data with Azure Data Factory (Copy Activity).

先决条件Prerequisites

本教程假设你拥有:This tutorial assumes you have:

将数据从平面文件源移动到 Azure VM 上的 SQL ServerMoving data from a flat file source to SQL Server on an Azure VM

如果数据位于平面文件中(以行/列格式排列),则可以通过以下方法将它移到 Azure 上的 SQL Server 虚拟机:If your data is in a flat file (arranged in a row/column format), it can be moved to SQL Server VM on Azure via the following methods:

  1. 命令行大容量复制实用程序 (BCP)Command-line bulk copy utility (BCP)
  2. 批量插入 SQL 查询Bulk Insert SQL Query
  3. SQL Server 中的图形内置实用程序(导入/导出、SSIS)Graphical Built-in Utilities in SQL Server (Import/Export, SSIS)

命令行大容量复制实用程序 (BCP)Command-line bulk copy utility (BCP)

BCP 是随 SQL Server 一起安装的命令行实用程序,并且是数据移动的最快方法之一。BCP is a command-line utility installed with SQL Server and is one of the quickest ways to move data. 它可跨三个 SQL Server 变体(本地 SQL Server、SQL Azure 以及 Azure 上的 SQL Server 虚拟机)运行。It works across all three SQL Server variants (On-premises SQL Server, SQL Azure and SQL Server VM on Azure).

备注

对于 BCP 我的数据应在哪里?Where should my data be for BCP?
尽管并非必需,但是将包含源数据的文件置于目标 SQL Server 所在的计算机上可以更快地进行传输(网络速度与本地磁盘 IO 速度)。While it is not required, having files containing source data located on the same machine as the target SQL Server allows for faster transfers (network speed vs local disk IO speed). 可以使用各种文件复制工具(如 AZCopyAzure 存储资源管理器,或者通过远程桌面协议 (RDP) 进行 Windows 复制/粘贴),将包含数据的平面文件移到已安装 SQL Server 的计算机。You can move the flat files containing data to the machine where SQL Server is installed using various file copying tools such as AZCopy, Azure Storage Explorer or windows copy/paste via Remote Desktop Protocol (RDP).

  1. 确保在目标 SQL Server 数据库上创建数据库和表。Ensure that the database and the tables are created on the target SQL Server database. 下面是如何使用 Create DatabaseCreate Table 命令执行此操作的示例:Here is an example of how to do that using the Create Database and Create Table commands:
CREATE DATABASE <database_name>

CREATE TABLE <tablename>
(
    <columnname1> <datatype> <constraint>,
    <columnname2> <datatype> <constraint>,
    <columnname3> <datatype> <constraint>
)
  1. 通过从已安装 bcp 的计算机的命令行执行以下命令,生成介绍表架构的格式文件。Generate the format file that describes the schema for the table by issuing the following command from the command-line of the machine where bcp is installed.

    bcp dbname..tablename format nul -c -x -f exportformatfilename.xml -S servername\sqlinstance -T -t \t -r \n

  2. 使用 bcp 命令将数据插入数据库,当 SQL Server 安装在同一台计算机上时,该命令应该从命令行运行:Insert the data into the database using the bcp command, which should work from the command line when SQL Server is installed on same machine:

    bcp dbname..tablename in datafilename.tsv -f exportformatfilename.xml -S servername\sqlinstancename -U username -P password -b block_size_to_move_in_single_attempt -t \t -r \n

优化 BCP 插入,请参阅以下文章优化批量导入的指南来优化此类插入。Optimizing BCP Inserts Please refer the following article 'Guidelines for Optimizing Bulk Import' to optimize such inserts.

并行插入可实现更快的数据移动Parallelizing Inserts for Faster Data Movement

如果正在移动的数据很大,则可以通过在 PowerShell 脚本中同时并行执行多个 BCP 命令加快移动速度。If the data you are moving is large, you can speed up things by simultaneously executing multiple BCP commands in parallel in a PowerShell Script.

备注

大型数据引入 若要优化大型和超大型数据集的数据加载,请使用多个文件组和分区表对逻辑数据库和物理数据库表进行分区。Big data Ingestion To optimize data loading for large and very large datasets, partition your logical and physical database tables using multiple file groups and partition tables. 有关创建并将数据加载到分区表的详细信息,请参阅并行加载 SQL 分区表For more information about creating and loading data to partition tables, see Parallel Load SQL Partition Tables.

下面的示例 PowerShell 脚本演示如何使用 bcp 并行插入:The following sample PowerShell script demonstrates parallel inserts using bcp:

$NO_OF_PARALLEL_JOBS=2

Set-ExecutionPolicy RemoteSigned #set execution policy for the script to execute
# Define what each job does
$ScriptBlock = {
    param($partitionnumber)

    #Explicitly using SQL username password
    bcp database..tablename in datafile_path.csv -F 2 -f format_file_path.xml -U username@servername -S tcp:servername -P password -b block_size_to_move_in_single_attempt -t "," -r \n -o path_to_outputfile.$partitionnumber.txt

    #Trusted connection w.o username password (if you are using windows auth and are signed in with that credentials)
    #bcp database..tablename in datafile_path.csv -o path_to_outputfile.$partitionnumber.txt -h "TABLOCK" -F 2 -f format_file_path.xml  -T -b block_size_to_move_in_single_attempt -t "," -r \n
}


# Background processing of all partitions
for ($i=1; $i -le $NO_OF_PARALLEL_JOBS; $i++)
{
    Write-Debug "Submit loading partition # $i"
    Start-Job $ScriptBlock -Arg $i      
}


# Wait for it all to complete
While (Get-Job -State "Running")
{
    Start-Sleep 10
    Get-Job
}

# Getting the information back from the jobs
Get-Job | Receive-Job
Set-ExecutionPolicy Restricted #reset the execution policy

批量插入 SQL 查询Bulk Insert SQL Query

批量插入 SQL 查询可用于将数据从基于行/列的文件导入数据库(受支持的类型在准备用于批量导出或导入的数据 (SQL Server) 主题中有介绍)。Bulk Insert SQL Query can be used to import data into the database from row/column based files (the supported types are covered in thePrepare Data for Bulk Export or Import (SQL Server)) topic.

以下是一些用于批量插入的示例命令:Here are some sample commands for Bulk Insert are as below:

  1. 分析数据并设置任何自定义选项后再导入,以确保 SQL Server 数据库对于任何特殊的字段(例如日期)均假设相同的格式。Analyze your data and set any custom options before importing to make sure that the SQL Server database assumes the same format for any special fields such as dates. 以下是如何将日期格式设置为“年-月-日”(如果数据包含“年-月-日”格式的日期)的示例:Here is an example of how to set the date format as year-month-day (if your data contains the date in year-month-day format):
SET DATEFORMAT ymd;
  1. 使用批量导入语句导入数据:Import data using bulk import statements:
BULK INSERT <tablename>
FROM
'<datafilename>'
WITH
(
    FirstRow = 2,
    FIELDTERMINATOR = ',', --this should be column separator in your data
    ROWTERMINATOR = '\n'   --this should be the row separator in your data
)

SQL Server 中的内置实用程序Built-in Utilities in SQL Server

可以使用 SQL Server 集成服务 (SSIS) 将数据从平面文件导入到 Azure 上的 SQL Server 虚拟机。You can use SQL Server Integrations Services (SSIS) to import data into SQL Server VM on Azure from a flat file. SSIS 在两个 Studio 环境中可用。SSIS is available in two studio environments. 有关详细信息,请参阅集成服务 (SSIS) 与 Studio 环境For details, see Integration Services (SSIS) and Studio Environments:

将数据从本地 SQL Server 移动到 Azure VM 上的 SQL ServerMoving Data from on-premises SQL Server to SQL Server on an Azure VM

此外,还可以使用以下迁移策略:You can also use the following migration strategies:

  1. 将 SQL Server 数据库部署到 Microsoft Azure 虚拟机向导Deploy a SQL Server Database to a Microsoft Azure VM wizard
  2. 导出到平面文件Export to Flat File
  3. SQL 数据库迁移向导SQL Database Migration Wizard
  4. 数据库备份和还原Database back up and restore

下面我们将对各个选项进行介绍:We describe each of these options below:

将 SQL Server 数据库部署到 Microsoft Azure 虚拟机向导Deploy a SQL Server Database to a Microsoft Azure VM wizard

将 SQL Server 数据库部署到 Microsoft Azure 虚拟机向导非常简单,建议采用这种方法将数据从本地 SQL Server 实例移到 Azure 虚拟机上的 SQL Server。The Deploy a SQL Server Database to a Microsoft Azure VM wizard is a simple and recommended way to move data from an on-premises SQL Server instance to SQL Server on an Azure VM. 有关详细的步骤以及其他备选方法的讨论,请参阅将数据库迁移到 Azure 虚拟机上的 SQL ServerFor detailed steps as well as a discussion of other alternatives, see Migrate a database to SQL Server on an Azure VM.

导出到平面文件Export to Flat File

可以使用各种方法从本地 SQL Server 批量导出数据(如批量导入和导出数据 (SQL Server) 主题中所述)。Various methods can be used to bulk export data from an On-Premises SQL Server as documented in the Bulk Import and Export of Data (SQL Server) topic. 本文档将举例说明“大容量复制程序 (BCP)”。This document will cover the Bulk Copy Program (BCP) as an example. 一旦数据导出到平面文件,就可以使用批量导入将其导入到另一个 SQL Server。Once data is exported into a flat file, it can be imported to another SQL server using bulk import.

  1. 使用 bcp 实用程序将数据从本地 SQL Server 导出到文件,如下所示Export the data from on-premises SQL Server to a File using the bcp utility as follows

    bcp dbname..tablename out datafile.tsv -S servername\sqlinstancename -T -t \t -t \n -c

  2. 使用步骤 1 中导出的表架构的 create databasecreate table 在 Azure 上的 SQL Server 虚拟机上创建数据库和表。Create the database and the table on SQL Server VM on Azure using the create database and create table for the table schema exported in step 1.

  3. 创建一个格式文件来描述正导出/导入的数据的表架构。Create a format file for describing the table schema of the data being exported/imported. 创建格式文件 (SQL Server) 中介绍了格式文件的详细信息。Details of the format file are described in Create a Format File (SQL Server).

    从 SQL Server 计算机运行 BCP 时生成格式文件Format file generation when running BCP from the SQL Server computer

    bcp dbname..tablename format nul -c -x -f exportformatfilename.xml -S servername\sqlinstance -T -t \t -r \n

    对 SQL Server 远程运行 BCP 时生成格式文件Format file generation when running BCP remotely against a SQL Server

    bcp dbname..tablename format nul -c -x -f exportformatfilename.xml -U username@servername.database.windows.net -S tcp:servername -P password --t \t -r \n

  4. 使用从文件源移动数据部分中介绍的任意方法将平面文件中的数据移动到 SQL Server。Use any of the methods described in section Moving Data from File Source to move the data in flat files to a SQL Server.

SQL 数据库迁移向导SQL Database Migration Wizard

SQL Server 数据库迁移向导提供了一种可在两个 SQL Server 实例之间移动数据的用户友好方法。SQL Server Database Migration Wizard provides a user-friendly way to move data between two SQL server instances. 它允许用户在源表和目标表之间映射数据架构,选择列类型和各种其他功能。It allows the user to map the data schema between sources and destination tables, choose column types and various other functionalities. 它使用隐式的大容量复制 (BCP)。It uses bulk copy (BCP) under the covers. SQL 数据库迁移向导的欢迎屏幕的屏幕快照如下所示。A screenshot of the welcome screen for the SQL Database Migration wizard is shown below.

SQL Server 迁移向导

数据库备份和还原Database back up and restore

SQL Server 支持:SQL Server supports:

  1. 数据库备份和还原功能(至本地文件或 bacpac 导出到 blob)和数据层应用程序(使用 bacpac)。Database back up and restore functionality (both to a local file or bacpac export to blob) and Data Tier Applications (using bacpac).
  2. 能够使用复制的数据库直接在 Azure 上创建 SQL Server VM 或复制到 SQL 数据库中的现有数据库。Ability to directly create SQL Server VMs on Azure with a copied database or copy to an existing database in SQL Database. 有关详细信息,请参阅 Use the Copy Database WizardFor more information, see Use the Copy Database Wizard.

SQL Server Management Studio 中的数据库备份/还原选项的屏幕快照如下所示。A screenshot of the Database back up/restore options from SQL Server Management Studio is shown below.

SQL Server 导入工具

资源Resources

将数据库迁移到 Azure 虚拟机上的 SQL ServerMigrate a Database to SQL Server on an Azure VM

Azure 虚拟机上 SQL Server 的概述SQL Server on Azure Virtual Machines overview