使用 Azure 数据工厂从 SAP 表复制数据Copy data from an SAP table by using Azure Data Factory

适用于: Azure 数据工厂 Azure Synapse Analytics

本文概述如何使用 Azure 数据工厂中的复制活动从 SAP 表复制数据。This article outlines how to use the copy activity in Azure Data Factory to copy data from an SAP table. 有关详细信息,请参阅复制活动概述For more information, see Copy activity overview.

提示

若要了解 ADF 对 SAP 数据集成方案的总体支持,请参阅使用 Azure 数据工厂进行 SAP 数据集成白皮书,其中包含每个 SAP 连接器的详细介绍、比较和指导。To learn ADF's overall support on SAP data integration scenario, see SAP data integration using Azure Data Factory whitepaper with detailed introduction on each SAP connector, comparsion and guidance.

支持的功能Supported capabilities

以下活动支持此 SAP 表连接器:This SAP table connector is supported for the following activities:

可以将数据从 SAP 表复制到任何受支持的接收器数据存储。You can copy data from an SAP table to any supported sink data store. 有关复制活动支持作为源或接收器的数据存储列表,请参阅支持的数据存储表。For a list of the data stores that are supported as sources or sinks by the copy activity, see the Supported data stores table.

具体而言,此 SAP 表连接器支持:Specifically, this SAP table connector supports:

  • 在以下产品中从 SAP 表复制数据:Copying data from an SAP table in:

    • SAP ERP 中心组件 (SAP ECC) 7.01 或更高版本(包含在 2015 年之后发布的最新 SAP 支持包堆栈中)。SAP ERP Central Component (SAP ECC) version 7.01 or later (in a recent SAP Support Package Stack released after 2015).
    • SAP Business Warehouse (SAP BW) 版本 7.01 或更高版本(位于最新的 SAP 支持包堆栈中,该堆栈是 2015 年以后发布的)。SAP Business Warehouse (SAP BW) version 7.01 or later (in a recent SAP Support Package Stack released after 2015).
    • SAP S/4HANA。SAP S/4HANA.
    • SAP Business Suite 版本 7.01 或更高版本(位于最新的 SAP 支持包堆栈中,该堆栈是 2015 年以后发布的)中的其他产品。Other products in SAP Business Suite version 7.01 or later (in a recent SAP Support Package Stack released after 2015).
  • 从 SAP 透明表、共用表、聚集表和视图复制数据。Copying data from both an SAP transparent table, a pooled table, a clustered table, and a view.

  • 使用基本身份验证或安全网络通信 (SNC)(如果已配置 SNC)复制数据。Copying data by using basic authentication or Secure Network Communications (SNC), if SNC is configured.

  • 连接到 SAP 应用程序服务器或 SAP 消息服务器。Connecting to an SAP application server or SAP message server.

  • 通过默认或自定义 RFC 检索数据。Retrieving data via default or custom RFC.

版本 7.01 或更高版本指 SAP NetWeaver 版本,而不是 SAP ECC 版本。The version 7.01 or later refers to SAP NetWeaver version instead of SAP ECC version. 例如,SAP ECC 6.0 EHP 7 的 NetWeaver 版本一般 >=7.4。For example, SAP ECC 6.0 EHP 7 in general has NetWeaver version >=7.4. 如果你不确定自己的环境,请在 SAP 系统中执行以下步骤来确认版本:In case you are unsure about your environment, here are the steps to confirm the version from your SAP system:

  1. 使用 SAP GUI 连接到 SAP 系统。Use SAP GUI to connect to the SAP System.
  2. 转到“系统” -> “状态” 。Go to System -> Status.
  3. 检查 SAP_BASIS 的版本,确保它等于或大于 701。Check the release of the SAP_BASIS, ensure it is equal to or larger than 701.
    检查 SAP_BASIS

先决条件Prerequisites

若要使用此 SAP 表连接器,需要:To use this SAP table connector, you need to:

  • 安装自承载集成运行时(3.17 或更高版本)。Set up a self-hosted integration runtime (version 3.17 or later). 有关详细信息,请参阅创建和配置自承载集成运行时For more information, see Create and configure a self-hosted integration runtime.

  • 从 SAP 网站下载 64 位 SAP Connector for Microsoft .NET 3.0,并将其安装到自承载集成运行时计算机上。Download the 64-bit SAP Connector for Microsoft .NET 3.0 from SAP's website, and install it on the self-hosted integration runtime machine. 安装期间,请在“可选安装步骤”窗口中确保选择“将程序集安装到 GAC”选项。 During installation, make sure you select the Install Assemblies to GAC option in the Optional setup steps window.

    安装 SAP Connector for .NET

  • 在数据工厂 SAP 表连接器中使用的 SAP 用户必须拥有以下权限:The SAP user who's being used in the Data Factory SAP table connector must have the following permissions:

    • 使用远程函数调用 (RFC) 目标的授权。Authorization for using Remote Function Call (RFC) destinations.
    • 对 S_SDSAUTH 授权对象拥有“执行”活动权限。Permissions to the Execute activity of the S_SDSAUTH authorization object. 可以在大多数授权对象上参考 SAP 说明 40089。You can refer to SAP Note 40089 on the majority authorization objects. 基础 NCo 连接器需要某些 RFC,例如 RFC_FUNCTION_SEARCH。Certain RFCs are required by the underlying NCo connector, for example RFC_FUNCTION_SEARCH.

入门Get started

若要使用管道执行复制活动,可以使用以下工具或 SDK 之一:To perform the Copy activity with a pipeline, you can use one of the following tools or SDKs:

以下部分详细介绍了用于定义特定于 SAP 表连接器的数据工厂实体的属性。The following sections provide details about properties that are used to define the Data Factory entities specific to the SAP table connector.

链接服务属性Linked service properties

SAP BW Open Hub 链接服务支持以下属性:The following properties are supported for the SAP BW Open Hub linked service:

属性Property 描述Description 必须Required
type type 属性必须设置为 SapTableThe type property must be set to SapTable. Yes
server SAP 实例所在的服务器的名称。The name of the server on which the SAP instance is located.
用于连接到 SAP 应用程序服务器。Use to connect to an SAP application server.
No
systemNumber SAP 系统的系统编号。The system number of the SAP system.
用于连接到 SAP 应用程序服务器。Use to connect to an SAP application server.
允许的值:用字符串表示的两位十进制数。Allowed value: A two-digit decimal number represented as a string.
No
messageServer SAP 消息服务器的主机名。The host name of the SAP message server.
用于连接到 SAP 消息服务器。Use to connect to an SAP message server.
No
messageServerService 消息服务器的服务名称或端口号。The service name or port number of the message server.
用于连接到 SAP 消息服务器。Use to connect to an SAP message server.
No
systemId 表所在的 SAP 系统的 ID。The ID of the SAP system where the table is located.
用于连接到 SAP 消息服务器。Use to connect to an SAP message server.
No
logonGroup SAP 系统的登录组。The logon group for the SAP system.
用于连接到 SAP 消息服务器。Use to connect to an SAP message server.
No
clientId SAP 系统中客户端的 ID。The ID of the client in the SAP system.
允许的值:用字符串表示的三位十进制数。Allowed value: A three-digit decimal number represented as a string.
Yes
language SAP 系统使用的语言。The language that the SAP system uses.
默认值为 ENDefault value is EN.
No
userName 有权访问 SAP 服务器的用户名。The name of the user who has access to the SAP server. Yes
password 用户的密码。The password for the user. 使用 SecureString 类型标记此字段可以安全地将其存储在数据工厂中,或引用存储在 Azure Key Vault 中的机密Mark this field with the SecureString type to store it securely in Data Factory, or reference a secret stored in Azure Key Vault. Yes
sncMode 用于访问表所在的 SAP 服务器的 SNC 激活指示符。The SNC activation indicator to access the SAP server where the table is located.
若要使用 SNC 连接到 SAP 服务器,可以使用此属性。Use if you want to use SNC to connect to the SAP server.
允许的值为 0(关闭,默认值)或 1(打开)。Allowed values are 0 (off, the default) or 1 (on).
No
sncMyName 用于访问表所在的 SAP 服务器的发起方 SNC 名称。The initiator's SNC name to access the SAP server where the table is located.
sncMode 打开时适用。Applies when sncMode is on.
No
sncPartnerName 用于访问表所在的 SAP 服务器的通信合作伙伴 SNC 名称。The communication partner's SNC name to access the SAP server where the table is located.
sncMode 打开时适用。Applies when sncMode is on.
No
sncLibraryPath 用于访问表所在的 SAP 服务器的外部安全产品库。The external security product's library to access the SAP server where the table is located.
sncMode 打开时适用。Applies when sncMode is on.
No
sncQop 要应用的保护级别的 SNC 质量。The SNC Quality of Protection level to apply.
sncMode 打开时适用。Applies when sncMode is On.
允许的值为 1(身份验证)、2(完整性)、3(隐私)、8(默认值)和 9(最大值)。Allowed values are 1 (Authentication), 2 (Integrity), 3 (Privacy), 8 (Default), 9 (Maximum).
No
connectVia 用于连接到数据存储的集成运行时The integration runtime to be used to connect to the data store. 如前面的先决条件中所述,需要安装自承载集成运行时。A self-hosted integration runtime is required, as mentioned earlier in Prerequisites. Yes

示例 1:连接到 SAP 应用程序服务器Example 1: Connect to an SAP application server

{
    "name": "SapTableLinkedService",
    "properties": {
        "type": "SapTable",
        "typeProperties": {
            "server": "<server name>",
            "systemNumber": "<system number>",
            "clientId": "<client ID>",
            "userName": "<SAP user>",
            "password": {
                "type": "SecureString",
                "value": "<Password for SAP user>"
            }
        },
        "connectVia": {
            "referenceName": "<name of Integration Runtime>",
            "type": "IntegrationRuntimeReference"
        }
    }
}

示例 2:连接到 SAP 消息服务器Example 2: Connect to an SAP message server

{
    "name": "SapTableLinkedService",
    "properties": {
        "type": "SapTable",
        "typeProperties": {
            "messageServer": "<message server name>",
            "messageServerService": "<service name or port>",
            "systemId": "<system ID>",
            "logonGroup": "<logon group>",
            "clientId": "<client ID>",
            "userName": "<SAP user>",
            "password": {
                "type": "SecureString",
                "value": "<Password for SAP user>"
            }
        },
        "connectVia": {
            "referenceName": "<name of integration runtime>",
            "type": "IntegrationRuntimeReference"
        }
    }
}

示例 3:使用 SNC 进行连接Example 3: Connect by using SNC

{
    "name": "SapTableLinkedService",
    "properties": {
        "type": "SapTable",
        "typeProperties": {
            "server": "<server name>",
            "systemNumber": "<system number>",
            "clientId": "<client ID>",
            "userName": "<SAP user>",
            "password": {
                "type": "SecureString",
                "value": "<Password for SAP user>"
            },
            "sncMode": 1,
            "sncMyName": "<SNC myname>",
            "sncPartnerName": "<SNC partner name>",
            "sncLibraryPath": "<SNC library path>",
            "sncQop": "8"
        },
        "connectVia": {
            "referenceName": "<name of integration runtime>",
            "type": "IntegrationRuntimeReference"
        }
    }
}

数据集属性Dataset properties

有关用于定义数据集的各个部分和属性的完整列表,请参阅数据集For a full list of the sections and properties for defining datasets, see Datasets. 以下部分提供 SAP 表数据集支持的属性的列表。The following section provides a list of the properties supported by the SAP table dataset.

支持使用以下属性从/向 SAP BW Open Hub 链接服务复制数据。To copy data from and to the SAP BW Open Hub linked service, the following properties are supported:

属性Property 描述Description 必须Required
type type 属性必须设置为 SapTableResourceThe type property must be set to SapTableResource. Yes
tableName 要从中复制数据的 SAP 表的名称。The name of the SAP table to copy data from. Yes

示例Example

{
    "name": "SAPTableDataset",
    "properties": {
        "type": "SapTableResource",
        "typeProperties": {
            "tableName": "<SAP table name>"
        },
        "schema": [],
        "linkedServiceName": {
            "referenceName": "<SAP table linked service name>",
            "type": "LinkedServiceReference"
        }
    }
}

复制活动属性Copy activity properties

有关用于定义活动的各部分和属性的完整列表,请参阅管道For a full list of the sections and properties for defining activities, see Pipelines. 以下部分提供 SAP 表源支持的属性的列表。The following section provides a list of the properties supported by the SAP table source.

SAP 表作为源SAP table as source

支持使用以下属性从 SAP 表复制数据:To copy data from an SAP table, the following properties are supported:

属性Property 描述Description 必须Required
type type 属性必须设置为 SapTableSourceThe type property must be set to SapTableSource. Yes
rowCount 要检索的行数。The number of rows to be retrieved. No
rfcTableFields 要从 SAP 表复制的字段(列)。The fields (columns) to copy from the SAP table. 例如,column0, column1For example, column0, column1. No
rfcTableOptions 用于筛选 SAP 表中的行的选项。The options to filter the rows in an SAP table. 例如,COLUMN0 EQ 'SOMEVALUE'For example, COLUMN0 EQ 'SOMEVALUE'. 另请参阅本文稍后提供的 SAP 查询运算符表。See also the SAP query operator table later in this article. No
customRfcReadTableFunctionModule 可用于从 SAP 表读取数据的自定义 RFC 函数模块。A custom RFC function module that can be used to read data from an SAP table.
可以使用自定义 RFC 函数模块来定义如何从 SAP 系统检索数据并将其返回到数据工厂。You can use a custom RFC function module to define how the data is retrieved from your SAP system and returned to Data Factory. 必须为自定义函数模块实现一个接口(导入、导出、表),类似于数据工厂使用的默认接口 /SAPDS/RFC_READ_TABLE2The custom function module must have an interface implemented (import, export, tables) that's similar to /SAPDS/RFC_READ_TABLE2, which is the default interface used by Data Factory.
数据工厂Data Factory
No
partitionOption 要从 SAP 表中读取的分区机制。The partition mechanism to read from an SAP table. 支持的选项包括:The supported options include:
  • None
  • PartitionOnInt(在左侧用零填充正常整数或整数值,例如 0000012345PartitionOnInt (normal integer or integer values with zero padding on the left, such as 0000012345)
  • PartitionOnCalendarYear(采用“YYYY”格式的 4 位数)PartitionOnCalendarYear (4 digits in the format "YYYY")
  • PartitionOnCalendarMonth(采用“YYYYMM”格式的 6 位数)PartitionOnCalendarMonth (6 digits in the format "YYYYMM")
  • PartitionOnCalendarDate(采用“YYYYMMDD”格式的 8 位数)PartitionOnCalendarDate (8 digits in the format "YYYYMMDD")
  • PartitionOntime(采用“HHMMSS”格式的 6 位数,例如 235959PartitionOntime (6 digits in the format "HHMMSS", such as 235959)
No
partitionColumnName 用于将数据分区的列的名称。The name of the column used to partition the data. No
partitionUpperBound partitionColumnName 中指定的用于继续分区的列的最大值。The maximum value of the column specified in partitionColumnName that will be used to continue with partitioning. No
partitionLowerBound partitionColumnName 中指定的用于继续分区的列的最小值。The minimum value of the column specified in partitionColumnName that will be used to continue with partitioning. (注意:当分区选项为 PartitionOnInt 时,partitionLowerBound 不能为“0”)(Note: partitionLowerBound cannot be "0" when partition option is PartitionOnInt) No
maxPartitionsNumber 要将数据拆分成的最大分区数。The maximum number of partitions to split the data into. No
sapDataColumnDelimiter 单个字符,将用作传递给 SAP RFC 的分隔符,以用于拆分输出数据。The single character that is used as delimiter passed to SAP RFC to split the output data. No

提示

如果 SAP 表包含大量数据(例如几十亿行),请使用 partitionOptionpartitionSetting 将数据拆分成小分区。If your SAP table has a large volume of data, such as several billion rows, use partitionOption and partitionSetting to split the data into smaller partitions. 在这种情况下,将按分区读取数据,并通过单个 RFC 调用从 SAP 服务器检索每个数据分区。In this case, the data is read per partition, and each data partition is retrieved from your SAP server via a single RFC call.

partitionOptionpartitionOnInt 为例,每个分区中的行数的计算公式为:(处于 partitionUpperBoundpartitionLowerBound 之间的总行数)/maxPartitionsNumberTaking partitionOption as partitionOnInt as an example, the number of rows in each partition is calculated with this formula: (total rows falling between partitionUpperBound and partitionLowerBound)/maxPartitionsNumber.

若要并行加载数据分区以加快复制速度,并行程度由复制活动的 parallelCopies 设置控制。To load data partitions in parallel to speed up copy, the parallel degree is controlled by the parallelCopies setting on the copy activity. 例如,如果将 parallelCopies 设置为 4,则数据工厂会根据指定的分区选项和设置并行生成并运行 4 个查询,每个查询从 SAP 表检索一部分数据。For example, if you set parallelCopies to four, Data Factory concurrently generates and runs four queries based on your specified partition option and settings, and each query retrieves a portion of data from your SAP table. 强烈建议将 maxPartitionsNumber 设为 parallelCopies 属性值的倍数。We strongly recommend making maxPartitionsNumber a multiple of the value of the parallelCopies property. 将数据复制到基于文件的数据存储中时,还建议将数据作为多个文件写入文件夹(仅指定文件夹名称),在这种情况下,性能优于写入单个文件。When copying data into file-based data store, it's also recommanded to write to a folder as multiple files (only specify folder name), in which case the performance is better than writing to a single file.

提示

默认情况下,已在 Azure 数据工厂端为此 SAP 表连接器启用 BASXMLThe BASXML is enabled by default for this SAP Table connector on Azure Data Factory side.

rfcTableOptions 中,可以使用以下常用 SAP 查询运算符来筛选行:In rfcTableOptions, you can use the following common SAP query operators to filter the rows:

运算符Operator 说明Description
EQ 等于Equal to
NE 不等于Not equal to
LT 小于Less than
LE 小于或等于Less than or equal to
GT 大于Greater than
GE 大于或等于Greater than or equal to
IN 例如 TABCLASS IN ('TRANSP', 'INTTAB')As in TABCLASS IN ('TRANSP', 'INTTAB')
LIKE 例如 LIKE 'Emma%'As in LIKE 'Emma%'

示例Example

"activities":[
    {
        "name": "CopyFromSAPTable",
        "type": "Copy",
        "inputs": [
            {
                "referenceName": "<SAP table input dataset name>",
                "type": "DatasetReference"
            }
        ],
        "outputs": [
            {
                "referenceName": "<output dataset name>",
                "type": "DatasetReference"
            }
        ],
        "typeProperties": {
            "source": {
                "type": "SapTableSource",
                "partitionOption": "PartitionOnInt",
                "partitionSettings": {
                     "partitionColumnName": "<partition column name>",
                     "partitionUpperBound": "2000",
                     "partitionLowerBound": "1",
                     "maxPartitionsNumber": 500
                 }
            },
            "sink": {
                "type": "<sink type>"
            },
            "parallelCopies": 4
        }
    }
]

联接 SAP 表Join SAP tables

SAP 表连接器目前仅支持单个具有默认函数模块的表。Currently SAP Table connector only supports one single table with the default function module. 若要获取多个表的联接数据,可以按照以下步骤利用 SAP 表连接器中的 customRfcReadTableFunctionModule 属性。To get the joined data of multiple tables, you can leverage the customRfcReadTableFunctionModule property in the SAP Table connector following steps below:

  • 编写自定义函数模块,该模块可将查询作为 OPTIONS,并应用你自己的逻辑来检索数据。Write a custom function module, which can take a query as OPTIONS and apply your own logic to retrieve the data.
  • 对于“自定义函数模块”,请输入自定义函数模块的名称。For the "Custom function module", enter the name of your custom function module.
  • 对于“RFC 表选项”,请指定要作为 OPTIONS 提供给函数模块的表联接语句,例如“<TABLE1> INNER JOIN <TABLE2> ON COLUMN0”。For the "RFC table options", specify the table join statement to feed into your function module as OPTIONS, such as "<TABLE1> INNER JOIN <TABLE2> ON COLUMN0".

下面是一个示例:Below is an example:

SAP 表联接

提示

还可考虑将联接数据聚合在 VIEW 中,这受 SAP 表连接器支持。You can also consider having the joined data aggregated in the VIEW, which is supported by SAP Table connector. 还可尝试将相关的表提取并载入到 Azure 上(例如,Azure 存储和 Azure SQL 数据库),然后使用数据流进行进一步联接或筛选。You can also try to extract related tables to get onboard onto Azure (e.g. Azure Storage, Azure SQL Database), then use Data Flow to proceed with further join or filter.

创建自定义函数模块Create custom function module

对于 SAP 表,我们目前支持复制源中的 customRfcReadTableFunctionModule 属性,这允许你利用自己的逻辑和流程数据。For SAP table, currently we support customRfcReadTableFunctionModule property in the copy source, which allows you to leverage your own logic and process data.

作为快速入门指南,以下是开始使用“自定义函数模块”的一些要求:As a quick guidance, here are some requirements to get started with the "Custom function module":

  • 定义:Definition:

    定义

  • 将数据导出到下表之一:Export data into one of the tables below:

    导出表 1

    导出表 2

以下是有关 SAP 表连接器如何与自定义函数模块一起工作的说明:Below are illustrations of how SAP table connector works with custom function module:

  1. 通过 SAP NCO 与 SAP 服务器建立连接。Build connection with SAP server via SAP NCO.

  2. 调用“自定义函数模块”,并将参数设置如下:Invoke "Custom function module" with the parameters set as below:

    • QUERY_TABLE:在 ADF SAP 表数据集中设置的表名称;QUERY_TABLE: the table name you set in the ADF SAP Table dataset;
    • 分隔符:在 ADF SAP 表源中设置的分隔符;Delimiter: the delimiter you set in the ADF SAP Table Source;
    • 行计数/选项/字段:在 ADF 表源中设置的行计数/聚合选项/字段。ROWCOUNT/Option/Fields: the Rowcount/Aggregated Option/Fields you set in the ADF Table source.
  3. 获取结果并按以下方式分析数据:Get the result and parse the data in below ways:

    1. 分析字段表中的值以获取架构。Parse the value in the Fields table to get the schemas.

      分析字段中的值

    2. 获取输出表中的值,以查看哪个表包含这些值。Get the values of the output table to see which table contains these values.

      获取输出表中的值

    3. 获取 OUT_TABLE 中的值,分析数据,然后将其写入接收器。Get the values in the OUT_TABLE, parse the data and then write it into the sink.

SAP 表的数据类型映射Data type mappings for an SAP table

从 SAP 表复制数据时,以下映射用于从 SAP 表数据类型映射到 Azure 数据工厂临时数据类型。When you're copying data from an SAP table, the following mappings are used from the SAP table data types to the Azure Data Factory interim data types. 若要了解复制活动如何将源架构和数据类型映射到接收器,请参阅架构和数据类型映射To learn how the copy activity maps the source schema and data type to the sink, see Schema and data type mappings.

SAP ABAP 类型SAP ABAP Type 数据工厂临时数据类型Data Factory interim data type
C(字符串)C (String) String
I(整数)I (Integer) Int32
F(浮点数)F (Float) Double
D(日期)D (Date) String
T(时间)T (Time) String
P(BCD Packed、货币、小数、数量)P (BCD Packed, Currency, Decimal, Qty) Decimal
N(数字)N (Numeric) String
X(二进制和原始数据)X (Binary and Raw) String

Lookup 活动属性Lookup activity properties

若要了解有关属性的详细信息,请查看 Lookup 活动To learn details about the properties, check Lookup activity.

后续步骤Next steps

有关 Azure 数据工厂中复制活动支持作为源和接收器的数据存储的列表,请参阅支持的数据存储For a list of the data stores supported as sources and sinks by the copy activity in Azure Data Factory, see Supported data stores.