Supported file formats and compression codecs by copy activity in Azure Data Factory

Applies to: Azure Data Factory, Azure Synapse Analytics

This article applies to the following connectors: Amazon S3, Azure Blob, Azure Data Lake Storage Gen2, Azure File Storage, File System, FTP, Google Cloud Storage, HDFS, HTTP, and SFTP.

Azure Data Factory supports the following file formats. Refer to the article for each format for its format-specific settings.

You can use the Copy activity to copy files as-is between two file-based data stores. In that case the data is copied efficiently, without any serialization or deserialization.
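As a rough local analogy for an as-is copy, the following Python sketch streams raw bytes from a source to a sink without ever decoding them. It is not the Copy activity itself: the function name `copy_as_is`, the in-memory streams, and the sample payload are assumptions made for illustration only.

```python
import io
import shutil


def copy_as_is(source, sink, chunk_size=64 * 1024):
    """Stream raw bytes from source to sink without decoding them."""
    # No parsing happens here: the bytes are moved chunk by chunk.
    shutil.copyfileobj(source, sink, chunk_size)


# Hypothetical sample data standing in for a file in a file-based store.
payload = b"id,name\n1,Ada\n2,Linus\n"
src = io.BytesIO(payload)
dst = io.BytesIO()

copy_as_is(src, dst)
copied = dst.getvalue()  # byte-for-byte identical to the payload
```

Because the content is never interpreted, the same routine works for any file type, which is why this mode avoids serialization cost.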

You can also parse or generate files of a given format. For example, you can do the following:

  • Copy data from a SQL Server database and write it to Azure Data Lake Storage Gen2 in Parquet format.
  • Copy files in text (CSV) format from an on-premises file system and write them to Azure Blob storage in Avro format.
  • Copy zipped files from an on-premises file system, decompress them on the fly, and write the extracted files to Azure Data Lake Storage Gen2.
  • Copy data in Gzip-compressed text (CSV) format from Azure Blob storage and write it to Azure SQL Database.
  • Perform many other activities that require serialization/deserialization or compression/decompression.
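To make the contrast with an as-is copy concrete, the sketch below mimics the Gzip-compressed CSV scenario from the list above on a local machine: it decompresses the bytes, parses the rows, and writes them to a relational table. This is only an illustration of the decompress-then-deserialize steps, not how the Copy activity is implemented. The in-memory SQLite database is a stand-in for Azure SQL Database, and the function name `load_gzip_csv`, the `people` table, and the sample data are all hypothetical.

```python
import csv
import gzip
import io
import sqlite3


def load_gzip_csv(compressed: bytes, conn: sqlite3.Connection) -> int:
    """Decompress Gzip CSV bytes, parse the rows, and insert them into a table."""
    # Step 1: decompression (Gzip codec).
    text = gzip.decompress(compressed).decode("utf-8")
    # Step 2: deserialization (delimited text with a header row).
    reader = csv.DictReader(io.StringIO(text))
    rows = [(int(r["id"]), r["name"]) for r in reader]
    # Step 3: write the typed rows to the relational sink.
    conn.execute("CREATE TABLE IF NOT EXISTS people (id INTEGER, name TEXT)")
    conn.executemany("INSERT INTO people (id, name) VALUES (?, ?)", rows)
    conn.commit()
    return len(rows)


# Hypothetical sample data standing in for a Gzip-compressed CSV blob.
compressed = gzip.compress(b"id,name\n1,Ada\n2,Linus\n")
conn = sqlite3.connect(":memory:")  # stand-in for the SQL sink

inserted = load_gzip_csv(compressed, conn)
loaded = conn.execute("SELECT id, name FROM people ORDER BY id").fetchall()
```

Unlike the as-is case, every byte here is interpreted, so the format and the compression codec must both be declared correctly for the copy to succeed.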

Next steps

See the other Copy activity articles: