Parameterizing mapping data flows

Applies to: Azure Data Factory, Azure Synapse Analytics

Mapping data flows in Azure Data Factory support the use of parameters. Define parameters inside your data flow definition and use them throughout your expressions. The parameter values are set by the calling pipeline via the Execute Data Flow activity. You have three options for setting the values in the data flow activity expressions:

  • Use the pipeline control flow expression language to set a dynamic value
  • Use the data flow expression language to set a dynamic value
  • Use either expression language to set a static literal value

Use this capability to make your data flows general-purpose, flexible, and reusable. You can parameterize data flow settings and expressions with these parameters.

Create parameters in a mapping data flow

To add parameters to your data flow, click the blank portion of the data flow canvas to see the general properties. In the settings pane, you will see a tab called Parameters. Select New to generate a new parameter. For each parameter, you must assign a name, select a type, and optionally set a default value.

Create Data Flow parameters

Use parameters in a mapping data flow

Parameters can be referenced in any data flow expression. Parameters begin with $ and are immutable. You will find the list of available parameters inside the Expression Builder under the Parameters tab.

Screenshot shows the available parameters in the Parameters tab.

You can quickly add additional parameters by selecting New parameter and specifying the name and type.

Screenshot shows the parameters in the Parameters tab with new parameters added.

Assign parameter values from a pipeline

Once you've created a data flow with parameters, you can execute it from a pipeline with the Execute Data Flow activity. After you add the activity to your pipeline canvas, the available data flow parameters are shown in the activity's Parameters tab.

When assigning parameter values, you can use either the pipeline expression language or the data flow expression language, based on Spark types. Each mapping data flow can have any combination of pipeline and data flow expression parameters.

Screenshot shows the Parameters tab with Data Flow expression selected for the value of myparam.

Pipeline expression parameters

Pipeline expression parameters allow you to reference system variables, functions, pipeline parameters, and variables, as in other pipeline activities. When you click Pipeline expression, a side navigation pane opens where you can enter an expression using the expression builder.

Screenshot shows the expression builder pane.

When referenced, pipeline parameters are evaluated and then their value is used in the data flow expression language. The pipeline expression type doesn't need to match the data flow parameter type.

String literals vs expressions

When assigning a pipeline expression parameter of type string, quotes are added by default and the value is evaluated as a literal. To read the parameter value as a data flow expression, check the expression box next to the parameter.

Screenshot shows the Data flow parameters pane with Expression selected for a parameter.

Suppose data flow parameter stringParam references a pipeline parameter with value upper(column1):

  • If expression is checked, $stringParam evaluates to the value of column1 in all uppercase.
  • If expression is not checked (default behavior), $stringParam evaluates to 'upper(column1)'.
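The two cases above can be modeled with a small Python sketch. It is only an illustration of the quoting rule described here, not the service's implementation; the helper name render_string_parameter is hypothetical.

```python
def render_string_parameter(pipeline_value: str, expression_checked: bool) -> str:
    """Return the text handed to the data flow for a string parameter.

    Hypothetical helper: it models the quoting rule only, not the real service.
    """
    if expression_checked:
        # Passed through unchanged, so the data flow evaluates it as an expression.
        return pipeline_value
    # Default behavior: wrapped in single quotes, so the data flow sees a literal.
    return f"'{pipeline_value}'"


literal = render_string_parameter("upper(column1)", expression_checked=False)
expression = render_string_parameter("upper(column1)", expression_checked=True)

print(literal)     # the quoted form, treated as a string literal
print(expression)  # the unquoted form, treated as a data flow expression
```

With the box unchecked the data flow receives a quoted string, so upper() is never called; with it checked, the same text is evaluated against each row.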

Passing in timestamps

In the pipeline expression language, system variables such as pipeline().TriggerTime and functions like utcNow() return timestamps as strings in the format 'yyyy-MM-dd'T'HH:mm:ss.SSSSSSZ'. To convert these into data flow parameters of type timestamp, use string interpolation to include the desired timestamp in a toTimestamp() function. For example, to convert the pipeline trigger time into a data flow parameter, you can use toTimestamp(left('@{pipeline().TriggerTime}', 23), 'yyyy-MM-dd\'T\'HH:mm:ss.SSS').

Screenshot shows the Parameters tab where you can enter a trigger time.

Note

Data flows support at most three digits of fractional seconds (millisecond precision). The left() function is used to trim off the additional digits.
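To see why 23 is the right length for left(), the following Python sketch mirrors the trimming step outside of Data Factory. The sample trigger time is made up for illustration, and trim_to_milliseconds is a hypothetical helper standing in for left(..., 23).

```python
from datetime import datetime


def trim_to_milliseconds(pipeline_timestamp: str) -> str:
    """Keep the first 23 characters, i.e. 'yyyy-MM-ddTHH:mm:ss.SSS'."""
    return pipeline_timestamp[:23]


# Hypothetical trigger time in the string format described above.
trigger_time = "2024-01-15T10:30:45.123456Z"

trimmed = trim_to_milliseconds(trigger_time)
# 10 (date) + 1 (T) + 8 (time) + 1 (dot) + 3 (milliseconds) = 23 characters.
parsed = datetime.strptime(trimmed, "%Y-%m-%dT%H:%M:%S.%f")

print(trimmed)
print(parsed.microsecond)
```

Everything after the third fractional digit, including the trailing Z, is discarded before parsing, which is the same effect left() has on the interpolated string.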

Pipeline parameter example

Say you have an integer parameter intParam that references a pipeline parameter of type String, @pipeline().parameters.pipelineParam.

Screenshot shows the Parameters tab with parameters named stringParam and intParam.

@pipeline().parameters.pipelineParam is assigned a value of abs(1) at runtime.

Screenshot shows the Parameters tab with the value of abs(1) selected.

When $intParam is referenced in an expression such as a derived column, it evaluates abs(1) and returns 1.

Screenshot shows the column value.

Data flow expression parameters

Selecting Data flow expression opens the data flow expression builder. You can reference functions, other parameters, and any defined schema column throughout your data flow. This expression is evaluated as is when referenced.

Note

If you pass in an invalid expression or reference a schema column that doesn't exist in that transformation, the parameter will evaluate to null.

Passing in a column name as a parameter

A common pattern is to pass in a column name as a parameter value. If the column is defined in the data flow schema, you can reference it directly as a string expression. If the column isn't defined in the schema, use the byName() function. Remember to cast the column to its appropriate type with a casting function such as toString().

For example, if you wanted to map a string column based upon a parameter columnName, you can add a derived column transformation equal to toString(byName($columnName)).
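As a rough analogy, the lookup-then-cast pattern can be sketched in Python. The functions by_name and to_string below are hypothetical stand-ins for byName() and toString(), the sample row is invented, and returning None for a missing column is an assumption of this sketch rather than documented behavior.

```python
from typing import Any, Optional


def by_name(row: dict, column_name: str) -> Optional[Any]:
    """Look a column up by name at run time (assumed: None if it is absent)."""
    return row.get(column_name)


def to_string(value: Optional[Any]) -> Optional[str]:
    """Cast the looked-up value to a string, leaving missing values as None."""
    return None if value is None else str(value)


# Hypothetical row; the column to read is chosen by a parameter, not hard-coded.
row = {"city": "Seattle", "zip": 98101}
column_name = "zip"

mapped = to_string(by_name(row, column_name))
print(mapped)
```

The cast matters because a by-name lookup cannot know the column's type ahead of time, so the caller states the expected type explicitly.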


Next steps