映射数据流中的数据转换表达式Data transformation expressions in mapping data flow

适用于: Azure 数据工厂 Azure Synapse Analytics

表达式函数Expression functions

在数据工厂中,使用映射数据流功能的表达式语言来配置数据转换。In Data Factory, use the expression language of the mapping data flow feature to configure data transformations.


abs

abs(<value1> : number) => number

数字的绝对值。Absolute value of a number.

  • abs(-20) -> 20
  • abs(10) -> 10

acos

acos(<value1> : number) => double

计算反余弦值。Calculates a cosine inverse value.

  • acos(1) -> 0.0

add

add(<value1> : any, <value2> : any) => any

添加一对字符串或数字。Adds a pair of strings or numbers. 将日期添加到日期数。Adds a date to a number of days. 将持续时间添加到时间戳。Adds a duration to a timestamp. 将类似类型的数组追加到另一个数组。Appends one array of similar type to another. 与 + 运算符相同。Same as the + operator.

  • add(10, 20) -> 30
  • 10 + 20 -> 30
  • add('ice', 'cream') -> 'icecream'
  • 'ice' + 'cream' + ' cone' -> 'icecream cone'
  • add(toDate('2012-12-12'), 3) -> toDate('2012-12-15')
  • toDate('2012-12-12') + 3 -> toDate('2012-12-15')
  • [10, 20] + [30, 40] -> [10, 20, 30, 40]
  • toTimestamp('2019-02-03 05:19:28.871', 'yyyy-MM-dd HH:mm:ss.SSS') + (days(1) + hours(2) - seconds(10)) -> toTimestamp('2019-02-04 07:19:18.871', 'yyyy-MM-dd HH:mm:ss.SSS')

addDays

addDays(<date/timestamp> : datetime, <days to add> : integral) => datetime

将日期添加到日期或时间戳。Add days to a date or timestamp. 与日期的 + 运算符相同。Same as the + operator for date.

  • addDays(toDate('2016-08-08'), 1) -> toDate('2016-08-09')

addMonths

addMonths(<date/timestamp> : datetime, <months to add> : integral, [<value3> : string]) => datetime

将月份添加到日期或时间戳。Add months to a date or timestamp. 可以选择性地传递时区。You can optionally pass a timezone.

  • addMonths(toDate('2016-08-31'), 1) -> toDate('2016-09-30')
  • addMonths(toTimestamp('2016-09-30 10:10:10'), -1) -> toTimestamp('2016-08-31 10:10:10')

and

and(<value1> : boolean, <value2> : boolean) => boolean

“逻辑与”运算符。Logical AND operator. 与 && 相同。Same as &&.

  • and(true, false) -> false
  • true && false -> false

asin

asin(<value1> : number) => double

计算反正弦值。Calculates an inverse sine value.

  • asin(0) -> 0.0

atan

atan(<value1> : number) => double

计算反正切值。Calculates a inverse tangent value.

  • atan(0) -> 0.0

atan2

atan2(<value1> : number, <value2> : number) => double

返回平面正 X 轴与给定坐标点之间的弧角。Returns the angle in radians between the positive x-axis of a plane and the point given by the coordinates.

  • atan2(0, 0) -> 0.0

between

between(<value1> : any, <value2> : any, <value3> : any) => boolean

检查第一个值是否在两个其他值之间(含这两个值)。Checks if the first value is in between two other values inclusively. 可以比较数字、字符串和日期值 * between(10, 5, 24) * true * between(currentDate(), currentDate() + 10, currentDate() + 20) * false


bitwiseAnd

bitwiseAnd(<value1> : integral, <value2> : integral) => integralNumeric, string and datetime values can be compared * between(10, 5, 24) * true * between(currentDate(), currentDate() + 10, currentDate() + 20) * false


bitwiseAnd

bitwiseAnd(<value1> : integral, <value2> : integral) => integral

跨整数类型的位和运算符。Bitwise And operator across integral types. 与 & 运算符相同 * bitwiseAnd(0xf4, 0xef) * 0xe4 * (0xf4 & 0xef) * 0xe4


bitwiseOr

bitwiseOr(<value1> : integral, <value2> : integral) => integralSame as & operator * bitwiseAnd(0xf4, 0xef) * 0xe4 * (0xf4 & 0xef) * 0xe4


bitwiseOr

bitwiseOr(<value1> : integral, <value2> : integral) => integral

跨整数类型的位或运算符。Bitwise Or operator across integral types. 与 | 运算符相同 * bitwiseOr(0xf4, 0xef) * 0xff * (0xf4 | 0xef) * 0xff


bitwiseXor

bitwiseXor(<value1> : any, <value2> : any) => anySame as | operator * bitwiseOr(0xf4, 0xef) * 0xff * (0xf4 | 0xef) * 0xff


bitwiseXor

bitwiseXor(<value1> : any, <value2> : any) => any

跨整数类型的位或运算符。Bitwise Or operator across integral types. 与 | 运算符相同 * bitwiseXor(0xf4, 0xef) * 0x1b * (0xf4 ^ 0xef) * 0x1b * (true ^ false) * true * (true ^ true) * false


blake2b

blake2b(<value1> : integer, <value2> : any, ...) => stringSame as | operator * bitwiseXor(0xf4, 0xef) * 0x1b * (0xf4 ^ 0xef) * 0x1b * (true ^ false) * true * (true ^ true) * false


blake2b

blake2b(<value1> : integer, <value2> : any, ...) => string

如果给定位长只能是 8 和 512 之间的 8 的倍数,计算具有不同基元数据类型的列集的 Blake2 摘要。Calculates the Blake2 digest of set of column of varying primitive datatypes given a bit length which can only be multiples of 8 between 8 & 512. 它可用于计算行的指纹 * blake2b(256, 'gunchus', 8.2, 'bojjus', true, toDate('2010-4-4')) * 'c9521a5080d8da30dffb430c50ce253c345cc4c4effc315dab2162dac974711d'


blake2bBinary

blake2bBinary(<value1> : integer, <value2> : any, ...) => binaryIt can be used to calculate a fingerprint for a row * blake2b(256, 'gunchus', 8.2, 'bojjus', true, toDate('2010-4-4')) * 'c9521a5080d8da30dffb430c50ce253c345cc4c4effc315dab2162dac974711d'


blake2bBinary

blake2bBinary(<value1> : integer, <value2> : any, ...) => binary

如果给定位长只能是 8 和 512 之间的 8 的倍数,计算具有不同基元数据类型的列集的 Blake2 摘要。Calculates the Blake2 digest of set of column of varying primitive datatypes given a bit length which can only be multiples of 8 between 8 & 512. 它可用于计算行的指纹 * blake2bBinary(256, 'gunchus', 8.2, 'bojjus', true, toDate('2010-4-4')) * unHex('c9521a5080d8da30dffb430c50ce253c345cc4c4effc315dab2162dac974711d')


case

case(<condition> : boolean, <true_expression> : any, <false_expression> : any, ...) => anyIt can be used to calculate a fingerprint for a row * blake2bBinary(256, 'gunchus', 8.2, 'bojjus', true, toDate('2010-4-4')) * unHex('c9521a5080d8da30dffb430c50ce253c345cc4c4effc315dab2162dac974711d')


case

case(<condition> : boolean, <true_expression> : any, <false_expression> : any, ...) => any

根据备用条件应用一个值或另一个值。Based on alternating conditions applies one value or the other. 如果输入的数字是偶数,对于最后一个条件,另一个值将默认为 NULL。If the number of inputs are even, the other is defaulted to NULL for last condition.

  • case(10 + 20 == 30, 'dumbo', 'gumbo') -> 'dumbo'
  • case(10 + 20 == 25, 'bojjus', 'do' < 'go', 'gunchus') -> 'gunchus'
  • isNull(case(10 + 20 == 25, 'bojjus', 'do' > 'go', 'gunchus')) -> true
  • case(10 + 20 == 25, 'bojjus', 'do' > 'go', 'gunchus', 'dumbo') -> 'dumbo'

cbrt

cbrt(<value1> : number) => double

计算一个数的立方根。Calculates the cube root of a number.

  • cbrt(8) -> 2.0

ceil

ceil(<value1> : number) => number

返回不小于该数字的最小整数。Returns the smallest integer not smaller than the number.

  • ceil(-0.1) -> 0

coalesce

coalesce(<value1> : any, ...) => any

返回一组输入中的第一个非 NULL 值。Returns the first not null value from a set of inputs. 所有输入应属于同一类型。All inputs should be of the same type.

  • coalesce(10, 20) -> 10
  • coalesce(toString(null), toString(null), 'dumbo', 'bo', 'go') -> 'dumbo'

columnNames

columnNames(<value1> : string) => array

获取流的所有输出列的名称。Gets the names of all output columns for a stream. 可以将可选流名称作为第二个参数传递。You can pass an optional stream name as the second argument.

  • columnNames()
  • columnNames('DeriveStream')

columns

columns([<stream name> : string]) => any

获取流的所有输出列的值。Gets the values of all output columns for a stream. 可以将可选流名称作为第二个参数传递。You can pass an optional stream name as the second argument.

  • columns()
  • columns('DeriveStream')

compare

compare(<value1> : any, <value2> : any) => integer

比较同一类型的两个值。Compares two values of the same type. 如果 value1 < value2,返回负整数;如果 value1 == value2,返回 0;如果 value1 > value2,则返回正值。Returns negative integer if value1 < value2, 0 if value1 == value2, positive value if value1 > value2.

  • (compare(12, 24) < 1) -> true
  • (compare('dumbo', 'dum') > 0) -> true

concat

concat(<this> : string, <that> : string, ...) => string

将字符串的可变数字连接在一起。Concatenates a variable number of strings together. 与包含字符串的 + 运算符相同。Same as the + operator with strings.

  • concat('dataflow', 'is', 'awesome') -> 'dataflowisawesome'
  • 'dataflow' + 'is' + 'awesome' -> 'dataflowisawesome'
  • isNull('sql' + null) -> true

concatWS

concatWS(<separator> : string, <this> : string, <that> : string, ...) => string

使用分隔符将字符串的可变数字连接在一起。Concatenates a variable number of strings together with a separator. 第一个参数是分隔符。The first parameter is the separator.

  • concatWS(' ', 'dataflow', 'is', 'awesome') -> 'dataflow is awesome'
  • isNull(concatWS(null, 'dataflow', 'is', 'awesome')) -> true
  • concatWS(' is ', 'dataflow', 'awesome') -> 'dataflow is awesome'

cos

cos(<value1> : number) => double

计算余弦值。Calculates a cosine value.

  • cos(10) -> -0.8390715290764524

cosh

cosh(<value1> : number) => double

计算某个值的双曲余弦值。Calculates a hyperbolic cosine of a value.

  • cosh(0) -> 1.0

crc32

crc32(<value1> : any, ...) => long

根据位长度计算不同基元数据类型的列集的 CRC32 哈希,值只能为 0(256)、224、256、384、512。Calculates the CRC32 hash of set of column of varying primitive datatypes given a bit length which can only be of values 0(256), 224, 256, 384, 512. 它可用于计算行的指纹。It can be used to calculate a fingerprint for a row.

  • crc32(256, 'gunchus', 8.2, 'bojjus', true, toDate('2010-4-4')) -> 3630253689L

currentDate

currentDate([<value1> : string]) => date

当此作业开始运行时获取当前日期。Gets the current date when this job starts to run. 可以“GMT”、“PST”、“UTC”、“America/Cayman”格式传递可选的时区。You can pass an optional timezone in the form of 'GMT', 'PST', 'UTC', 'America/Cayman'. 当地时区用作默认值。The local timezone is used as the default. 如需查看可用的格式,请参阅 Java 的 SimpleDateFormat 类。Refer Java's SimpleDateFormat class for available formats. https://docs.oracle.com/javase/8/docs/api/java/text/SimpleDateFormat.html.https://docs.oracle.com/javase/8/docs/api/java/text/SimpleDateFormat.html.

  • currentDate() == toDate('2250-12-31') -> false
  • currentDate('PST') == toDate('2250-12-31') -> false
  • currentDate('America/New_York') == toDate('2250-12-31') -> false

currentTimestamp

currentTimestamp() => timestamp

当作业开始运行时,获取采用当地时区的当前时间戳。Gets the current timestamp when the job starts to run with local time zone.

  • currentTimestamp() == toTimestamp('2250-12-31 12:12:12') -> false

currentUTC

currentUTC([<value1> : string]) => timestamp

获取 UTC 格式的当前时间戳。Gets the current timestamp as UTC. 如果希望在与群集时区不同的时区中解释当前时间,则可以采用“GMT”、“PST”、“UTC”或“America/Cayman”格式传递一个可选时区。If you want your current time to be interpreted in a different timezone than your cluster time zone, you can pass an optional timezone in the form of 'GMT', 'PST', 'UTC', 'America/Cayman'. 默认为当前时区。It is defaulted to the current timezone. 如需查看可用的格式,请参阅 Java 的 SimpleDateFormat 类。Refer Java's SimpleDateFormat class for available formats. https://docs.oracle.com/javase/8/docs/api/java/text/SimpleDateFormat.html.https://docs.oracle.com/javase/8/docs/api/java/text/SimpleDateFormat.html. 若要将 UTC 时间转换为其他时区,请使用 fromUTC()To convert the UTC time to a different timezone use fromUTC().

  • currentUTC() == toTimestamp('2050-12-12 19:18:12') -> false
  • currentUTC() != toTimestamp('2050-12-12 19:18:12') -> true
  • fromUTC(currentUTC(), 'Asia/Seoul') != toTimestamp('2050-12-12 19:18:12') -> true

dayOfMonth

dayOfMonth(<value1> : datetime) => integer

获取给定日期的月份日期。Gets the day of the month given a date.

  • dayOfMonth(toDate('2018-06-08')) -> 8

dayOfWeek

dayOfWeek(<value1> : datetime) => integer

获取给定日期的星期日期。Gets the day of the week given a date. 1 - 星期日,2 - 星期一…7 - 星期六。1 - Sunday, 2 - Monday ..., 7 - Saturday.

  • dayOfWeek(toDate('2018-06-08')) -> 6

dayOfYear

dayOfYear(<value1> : datetime) => integer

获取给定日期的年份日期。Gets the day of the year given a date.

  • dayOfYear(toDate('2016-04-09')) -> 100

days

days(<value1> : integer) => long

天数的持续时间(以毫秒为单位)。Duration in milliseconds for number of days.

  • days(2) -> 172800000L

degrees

degrees(<value1> : number) => double

将弧度转换为度。Converts radians to degrees.

  • degrees(3.141592653589793) -> 180

divide

divide(<value1> : any, <value2> : any) => any

将数字对相除。Divides pair of numbers. / 运算符相同。Same as the / operator.

  • divide(20, 10) -> 2
  • 20 / 10 -> 2

endsWith

endsWith(<string> : string, <substring to check> : string) => boolean

检查字符串是否以提供的字符串结尾。Checks if the string ends with the supplied string.

  • endsWith('dumbo', 'mbo') -> true

equals

equals(<value1> : any, <value2> : any) => boolean

“等于”比较运算符。Comparison equals operator. 与 == 运算符相同。Same as == operator.

  • equals(12, 24) -> false
  • 12 == 24 -> false
  • 'bad' == 'bad' -> true
  • isNull('good' == toString(null)) -> true
  • isNull(null == null) -> true

equalsIgnoreCase

equalsIgnoreCase(<value1> : string, <value2> : string) => boolean

忽略大小写的“等于”比较运算符。Comparison equals operator ignoring case. 与 <=> 运算符相同。Same as <=> operator.

  • 'abc'<=>'Abc' -> true
  • equalsIgnoreCase('abc', 'Abc') -> true

escape

escape(<string_to_escape> : string, <format> : string) => string

根据格式转义字符串。Escapes a string according to a format. 可接受的格式的文本值为 "json"、"xml"、"ecmascript"、"html"、"java"。Literal values for acceptable format are 'json', 'xml', 'ecmascript', 'html', 'java'.


expr

expr(<expr> : string) => any

从字符串生成表达式。Results in a expression from a string. 这与在非文本窗体中编写此表达式的效果一样。This is the same as writing this expression in a non-literal form. 这可以用来以字符串表示形式传递参数。This can be used to pass parameters as string representations. * expr(‘price * discount’) => any ___

factorial

factorial(<value1> : number) => long* expr(‘price * discount’) => any ___

factorial

factorial(<value1> : number) => long

计算一个数的阶乘。Calculates the factorial of a number.

  • factorial(5) -> 120

false

false() => boolean

始终返回 false 值。Always returns a false value. 如果存在名为“false”的列,则使用函数 syntax(false())Use the function syntax(false()) if there is a column named 'false'.

  • (10 + 20 > 30) -> false
  • (10 + 20 > 30) -> false()

floor

floor(<value1> : number) => number

返回不大于该数字的最大整数。Returns the largest integer not greater than the number.

  • floor(-0.1) -> -1

fromBase64

fromBase64(<value1> : string) => string

以 base64 编码给定字符串。Encodes the given string in base64.

  • fromBase64('Z3VuY2h1cw==') -> 'gunchus'

fromUTC

fromUTC(<value1> : timestamp, [<value2> : string]) => timestamp

转换为 UTC 格式的时间戳。Converts to the timestamp from UTC. 可以选择以“GMT”、“PST”、“UTC”、“America/Cayman”格式传递时区。You can optionally pass the timezone in the form of 'GMT', 'PST', 'UTC', 'America/Cayman'. 默认为当前时区。It is defaulted to the current timezone. 如需查看可用的格式,请参阅 Java 的 SimpleDateFormat 类。Refer Java's SimpleDateFormat class for available formats. https://docs.oracle.com/javase/8/docs/api/java/text/SimpleDateFormat.html.https://docs.oracle.com/javase/8/docs/api/java/text/SimpleDateFormat.html.

  • fromUTC(currentTimestamp()) == toTimestamp('2050-12-12 19:18:12') -> false
  • fromUTC(currentTimestamp(), 'Asia/Seoul') != toTimestamp('2050-12-12 19:18:12') -> true

greater

greater(<value1> : any, <value2> : any) => boolean

“大于”比较运算符。Comparison greater operator. 与 > 运算符相同。Same as > operator.

  • greater(12, 24) -> false
  • ('dumbo' > 'dum') -> true
  • (toTimestamp('2019-02-05 08:21:34.890', 'yyyy-MM-dd HH:mm:ss.SSS') > toTimestamp('2019-02-03 05:19:28.871', 'yyyy-MM-dd HH:mm:ss.SSS')) -> true

greaterOrEqual

greaterOrEqual(<value1> : any, <value2> : any) => boolean

“大于等于”比较运算符。Comparison greater than or equal operator. 与 >= 运算符相同。Same as >= operator.

  • greaterOrEqual(12, 12) -> true
  • ('dumbo' >= 'dum') -> true

greatest

greatest(<value1> : any, ...) => any

返回输入值列表中的最大值,跳过 NULL 值。Returns the greatest value among the list of values as input skipping null values. 如果所有输入均为 NULL,则返回 NULL。Returns null if all inputs are null.

  • greatest(10, 30, 15, 20) -> 30
  • greatest(10, toInteger(null), 20) -> 20
  • greatest(toDate('2010-12-12'), toDate('2011-12-12'), toDate('2000-12-12')) -> toDate('2011-12-12')
  • greatest(toTimestamp('2019-02-03 05:19:28.871', 'yyyy-MM-dd HH:mm:ss.SSS'), toTimestamp('2019-02-05 08:21:34.890', 'yyyy-MM-dd HH:mm:ss.SSS')) -> toTimestamp('2019-02-05 08:21:34.890', 'yyyy-MM-dd HH:mm:ss.SSS')

hasColumn

hasColumn(<column name> : string, [<stream name> : string]) => boolean

在流中按名称检查列值。Checks for a column value by name in the stream. 可以将可选流名称作为第二个参数传递。You can pass a optional stream name as the second argument. 设计时已知的列名应该只按名称进行寻址。Column names known at design time should be addressed just by their name. 不支持计算输入,但可以使用参数替换。Computed inputs are not supported but you can use parameter substitutions.

  • hasColumn('parent')

hour

hour(<value1> : timestamp, [<value2> : string]) => integer

获取时间戳的小时值。Gets the hour value of a timestamp. 可以“GMT”、“PST”、“UTC”、“America/Cayman”格式传递可选的时区。You can pass an optional timezone in the form of 'GMT', 'PST', 'UTC', 'America/Cayman'. 当地时区用作默认值。The local timezone is used as the default. 如需查看可用的格式,请参阅 Java 的 SimpleDateFormat 类。Refer Java's SimpleDateFormat class for available formats. https://docs.oracle.com/javase/8/docs/api/java/text/SimpleDateFormat.html.https://docs.oracle.com/javase/8/docs/api/java/text/SimpleDateFormat.html.

  • hour(toTimestamp('2009-07-30 12:58:59')) -> 12
  • hour(toTimestamp('2009-07-30 12:58:59'), 'PST') -> 12

hours

hours(<value1> : integer) => long

小时数的持续时间(以毫秒为单位)。Duration in milliseconds for number of hours.

  • hours(2) -> 7200000L

iif

iif(<condition> : boolean, <true_expression> : any, [<false_expression> : any]) => any

根据条件应用一个值或另一个值。Based on a condition applies one value or the other. 如果未指定另一个值,则将它视为 NULL。If other is unspecified it is considered NULL. 这两个值必须兼容(数字、字符串…)。* iif(10 + 20 == 30, 'dumbo', 'gumbo') -> 'dumbo'Both the values must be compatible(numeric, string...). * iif(10 + 20 == 30, 'dumbo', 'gumbo') -> 'dumbo'

  • iif(10 > 30, 'dumbo', 'gumbo') -> 'gumbo'
  • iif(month(toDate('2018-12-01')) == 12, 345.12, 102.67) -> 345.12

iifNull

iifNull(<value1> : any, [<value2> : any], ...) => any

检查第一个参数是否为 NULL。Checks if the first parameter is null. 如果不为 NULL,则返回第一个参数。If not null, the first parameter is returned. 如果为 NULL,则返回第二个参数。If null, the second parameter is returned. 如果指定了三个参数,则行为与 iif(isNull(value1), value2, value3) 相同,如果第一个值不为 NULL,则返回第三个参数。If three parameters are specified, the behavior is the same as iif(isNull(value1), value2, value3) and the third parameter is returned if the first value is not null.

  • iifNull(10, 20) -> 10
  • iifNull(null, 20, 40) -> 20
  • iifNull('azure', 'data', 'factory') -> 'factory'
  • iifNull(null, 'data', 'factory') -> 'data'

initCap

initCap(<value1> : string) => string

将每个单词的第一个字母转换为大写。Converts the first letter of every word to uppercase. 通过空格分隔符识别单词。Words are identified as separated by whitespace.

  • initCap('cool iceCREAM') -> 'Cool Icecream'

instr

instr(<string> : string, <substring to find> : string) => integer

在字符串中查找子字符串的位置(从 1 开始)。Finds the position(1 based) of the substring within a string. 如果未找到,则返回 0。0 is returned if not found.

  • instr('dumbo', 'mbo') -> 3
  • instr('microsoft', 'o') -> 5
  • instr('good', 'bad') -> 0

isDelete

isDelete([<value1> : integer]) => boolean

检查该行是否标记为删除。Checks if the row is marked for delete. 对于采用多个输入流的转换,可以传递流的(从 1 开始)索引。For transformations taking more than one input stream you can pass the (1-based) index of the stream. 流索引应为 1 或 2,默认值为 1。The stream index should be either 1 or 2 and the default value is 1.

  • isDelete()
  • isDelete(1)

isError

isError([<value1> : integer]) => boolean

检查该行是否标记为错误。Checks if the row is marked as error. 对于采用多个输入流的转换,可以传递流的(从 1 开始)索引。For transformations taking more than one input stream you can pass the (1-based) index of the stream. 流索引应为 1 或 2,默认值为 1。The stream index should be either 1 or 2 and the default value is 1.

  • isError()
  • isError(1)

isIgnore

isIgnore([<value1> : integer]) => boolean

检查该行是否标记为忽略。Checks if the row is marked to be ignored. 对于采用多个输入流的转换,可以传递流的(从 1 开始)索引。For transformations taking more than one input stream you can pass the (1-based) index of the stream. 流索引应为 1 或 2,默认值为 1。The stream index should be either 1 or 2 and the default value is 1.

  • isIgnore()
  • isIgnore(1)

isInsert

isInsert([<value1> : integer]) => boolean

检查该行是否标记为插入。Checks if the row is marked for insert. 对于采用多个输入流的转换,可以传递流的(从 1 开始)索引。For transformations taking more than one input stream you can pass the (1-based) index of the stream. 流索引应为 1 或 2,默认值为 1。The stream index should be either 1 or 2 and the default value is 1.

  • isInsert()
  • isInsert(1)

isMatch

isMatch([<value1> : integer]) => boolean

检查查找时是否匹配该行。Checks if the row is matched at lookup. 对于采用多个输入流的转换,可以传递流的(从 1 开始)索引。For transformations taking more than one input stream you can pass the (1-based) index of the stream. 流索引应为 1 或 2,默认值为 1。The stream index should be either 1 or 2 and the default value is 1.

  • isMatch()
  • isMatch(1)

isNull

isNull(<value1> : any) => boolean

检查值是否为 NULL。Checks if the value is NULL.

  • isNull(NULL()) -> true
  • isNull('') -> false

isUpdate

isUpdate([<value1> : integer]) => boolean

检查该行是否标记为更新。Checks if the row is marked for update. 对于采用多个输入流的转换,可以传递流的(从 1 开始)索引。For transformations taking more than one input stream you can pass the (1-based) index of the stream. 流索引应为 1 或 2,默认值为 1。The stream index should be either 1 or 2 and the default value is 1.

  • isUpdate()
  • isUpdate(1)

isUpsert

isUpsert([<value1> : integer]) => boolean

检查该行是否标记为插入。Checks if the row is marked for insert. 对于采用多个输入流的转换,可以传递流的(从 1 开始)索引。For transformations taking more than one input stream you can pass the (1-based) index of the stream. 流索引应为 1 或 2,默认值为 1。The stream index should be either 1 or 2 and the default value is 1.

  • isUpsert()
  • isUpsert(1)

lastDayOfMonth

lastDayOfMonth(<value1> : datetime) => date

获取给定日期的最后一个月份日期。Gets the last date of the month given a date.

  • lastDayOfMonth(toDate('2009-01-12')) -> toDate('2009-01-31')

least

least(<value1> : any, ...) => any

“小于等于”比较运算符。Comparison lesser than or equal operator. 与 <= 运算符相同。Same as <= operator.

  • least(10, 30, 15, 20) -> 10
  • least(toDate('2010-12-12'), toDate('2011-12-12'), toDate('2000-12-12')) -> toDate('2000-12-12')

left

left(<string to subset> : string, <number of characters> : integral) => string

从索引 1 处开始提取包含字符数的子字符串。Extracts a substring start at index 1 with number of characters. 与 SUBSTRING(str, 1, n) 相同。Same as SUBSTRING(str, 1, n).

  • left('bojjus', 2) -> 'bo'
  • left('bojjus', 20) -> 'bojjus'

length

length(<value1> : string) => integer

返回字符串的长度。Returns the length of the string.

  • length('dumbo') -> 5

lesser

lesser(<value1> : any, <value2> : any) => boolean

“小于”比较运算符。Comparison less operator. 与 < 运算符相同。Same as < operator.

  • lesser(12, 24) -> true
  • ('abcd' < 'abc') -> false
  • (toTimestamp('2019-02-03 05:19:28.871', 'yyyy-MM-dd HH:mm:ss.SSS') < toTimestamp('2019-02-05 08:21:34.890', 'yyyy-MM-dd HH:mm:ss.SSS')) -> true

lesserOrEqual

lesserOrEqual(<value1> : any, <value2> : any) => boolean

“小于等于”比较运算符。Comparison lesser than or equal operator. 与 <= 运算符相同。Same as <= operator.

  • lesserOrEqual(12, 12) -> true
  • ('dumbo' <= 'dum') -> false

levenshtein

levenshtein(<from string> : string, <to string> : string) => integer

获取两个字符串之间的 levenshtein 距离。Gets the levenshtein distance between two strings.

  • levenshtein('boys', 'girls') -> 4

like

like(<string> : string, <pattern match> : string) => boolean

模式是按原义匹配的字符串。The pattern is a string that is matched literally. 以下特殊符号除外:_ 与输入中的任何一个字符匹配(类似于The exceptions are the following special symbols: _ matches any one character in the input (similar to . posix 正则表达式中的 .)。% 匹配输入中的零个或多个字符(类似于 posix 正则表达式中的 .*)。in posix regular expressions) % matches zero or more characters in the input (similar to .* in posix regular expressions). 转义字符为 ''。The escape character is ''. 如果转义字符的之前带有特殊符号或其他转义字符,则在字面上匹配后面的字符。If an escape character precedes a special symbol or another escape character, the following character is matched literally. 转义其他任何字符的操作无效。It is invalid to escape any other character.

  • like('icecream', 'ice%') -> true

locate

locate(<substring to find> : string, <string> : string, [<from index - 1-based> : integral]) => integer

从特定的位置开始,在字符串中查找子字符串的位置(从 1 开始)。Finds the position(1 based) of the substring within a string starting a certain position. 如果省略位置,则视为从字符串开头查找。If the position is omitted it is considered from the beginning of the string. 如果未找到,则返回 0。0 is returned if not found.

  • locate('mbo', 'dumbo') -> 3
  • locate('o', 'microsoft', 6) -> 7
  • locate('bad', 'good') -> 0

log

log(<value1> : number, [<value2> : number]) => double

计算对数值。Calculates log value. 可以提供可选的底,否则使用欧拉数。An optional base can be supplied else a Euler number if used.

  • log(100, 10) -> 2

log10

log10(<value1> : number) => double

以 10 为底计算对数值。Calculates log value based on 10 base.

  • log10(100) -> 2

lower

lower(<value1> : string) => string

将字符串小写。Lowercases a string.

  • lower('GunChus') -> 'gunchus'

lpad

lpad(<string to pad> : string, <final padded length> : integral, <padding> : string) => string

在左侧使用提供的填充内容填充字符串,直到达到特定的长度。Left pads the string by the supplied padding until it is of a certain length. 如果字符串等于或大于特定长度,则进行剪裁。If the string is equal to or greater than the length, then it is trimmed to the length.

  • lpad('dumbo', 10, '-') -> '-----dumbo'
  • lpad('dumbo', 4, '-') -> 'dumb'
    * lpad('dumbo', 8, '<>') -> '<><dumbo'

ltrim

ltrim(<string to trim> : string, [<trim characters> : string]) => string* lpad('dumbo', 8, '<>') -> '<><dumbo'


ltrim

ltrim(<string to trim> : string, [<trim characters> : string]) => string

在左侧裁剪掉前导和尾随字符的字符串。Left trims a string of leading characters. 如果未指定第二个参数,则裁剪掉空格。If second parameter is unspecified, it trims whitespace. 否则,剪裁掉第二个参数中指定的任何字符。Else it trims any character specified in the second parameter.

  • ltrim(' dumbo ') -> 'dumbo '
  • ltrim('!--!du!mbo!', '-!') -> 'du!mbo!'

md5

md5(<value1> : any, ...) => string

计算不同基元数据类型的列集的 MD5 摘要,并返回 32 字符十六进制字符串。Calculates the MD5 digest of set of column of varying primitive datatypes and returns a 32 character hex string. 它可用于计算行的指纹。It can be used to calculate a fingerprint for a row.

  • md5(5, 'gunchus', 8.2, 'bojjus', true, toDate('2010-4-4')) -> '4ce8a880bd621a1ffad0bca905e1bc5a'

millisecond

millisecond(<value1> : timestamp, [<value2> : string]) => integer

获取日期的毫秒值。Gets the millisecond value of a date. 可以“GMT”、“PST”、“UTC”、“America/Cayman”格式传递可选的时区。You can pass an optional timezone in the form of 'GMT', 'PST', 'UTC', 'America/Cayman'. 当地时区用作默认值。The local timezone is used as the default. 如需查看可用的格式,请参阅 Java 的 SimpleDateFormat 类。Refer Java's SimpleDateFormat class for available formats. https://docs.oracle.com/javase/8/docs/api/java/text/SimpleDateFormat.html.https://docs.oracle.com/javase/8/docs/api/java/text/SimpleDateFormat.html.

  • millisecond(toTimestamp('2009-07-30 12:58:59.871', 'yyyy-MM-dd HH:mm:ss.SSS')) -> 871

milliseconds

milliseconds(<value1> : integer) => long

毫秒数的持续时间(以毫秒为单位)。Duration in milliseconds for number of milliseconds.

  • milliseconds(2) -> 2L

minus

minus(<value1> : any, <value2> : any) => any

减去数字。Subtracts numbers. 从日期数中减去日期。Subtract number of days from a date. 从时间戳中减去持续时间。Subtract duration from a timestamp. 减去两个时间戳以获取差异(以毫秒为单位)。Subtract two timestamps to get difference in milliseconds. 与 - 运算符相同。Same as the - operator.

  • minus(20, 10) -> 10
  • 20 - 10 -> 10
  • minus(toDate('2012-12-15'), 3) -> toDate('2012-12-12')
  • toDate('2012-12-15') - 3 -> toDate('2012-12-12')
  • toTimestamp('2019-02-03 05:19:28.871', 'yyyy-MM-dd HH:mm:ss.SSS') + (days(1) + hours(2) - seconds(10)) -> toTimestamp('2019-02-04 07:19:18.871', 'yyyy-MM-dd HH:mm:ss.SSS')
  • toTimestamp('2019-02-03 05:21:34.851', 'yyyy-MM-dd HH:mm:ss.SSS') - toTimestamp('2019-02-03 05:21:36.923', 'yyyy-MM-dd HH:mm:ss.SSS') -> -2072

minute

minute(<value1> : timestamp, [<value2> : string]) => integer

获取时间戳的分钟值。Gets the minute value of a timestamp. 可以“GMT”、“PST”、“UTC”、“America/Cayman”格式传递可选的时区。You can pass an optional timezone in the form of 'GMT', 'PST', 'UTC', 'America/Cayman'. 当地时区用作默认值。The local timezone is used as the default. 如需查看可用的格式,请参阅 Java 的 SimpleDateFormat 类。Refer Java's SimpleDateFormat class for available formats. https://docs.oracle.com/javase/8/docs/api/java/text/SimpleDateFormat.html.https://docs.oracle.com/javase/8/docs/api/java/text/SimpleDateFormat.html.

  • minute(toTimestamp('2009-07-30 12:58:59')) -> 58
  • minute(toTimestamp('2009-07-30 12:58:59'), 'PST') -> 58

minutes

minutes(<value1> : integer) => long

分钟数的持续时间(以毫秒为单位)。Duration in milliseconds for number of minutes.

  • minutes(2) -> 120000L

mod

mod(<value1> : any, <value2> : any) => any

将数字对取模。Modulus of pair of numbers. 与 % 运算符相同。Same as the % operator.

  • mod(20, 8) -> 4
  • 20 % 8 -> 4

month

month(<value1> : datetime) => integer

获取日期或时间戳的月份值。Gets the month value of a date or timestamp.

  • month(toDate('2012-8-8')) -> 8

monthsBetween

monthsBetween(<from date/timestamp> : datetime, <to date/timestamp> : datetime, [<roundoff> : boolean], [<time zone> : string]) => double

获取两个日期之间的月数。Gets the number of months between two dates. 可以舍入计算。可以“GMT”、“PST”、“UTC”、“America/Cayman”格式传递可选的时区。You can round off the calculation.You can pass an optional timezone in the form of 'GMT', 'PST', 'UTC', 'America/Cayman'. 当地时区用作默认值。The local timezone is used as the default. 如需查看可用的格式,请参阅 Java 的 SimpleDateFormat 类。Refer Java's SimpleDateFormat class for available formats. https://docs.oracle.com/javase/8/docs/api/java/text/SimpleDateFormat.html.https://docs.oracle.com/javase/8/docs/api/java/text/SimpleDateFormat.html.

  • monthsBetween(toTimestamp('1997-02-28 10:30:00'), toDate('1996-10-30')) -> 3.94959677

multiply

multiply(<value1> : any, <value2> : any) => any

将数字对相乘。Multiplies pair of numbers. 与 * 运算符相同。Same as the * operator.

  • multiply(20, 10) -> 200
  • 20 * 10 -> 200

negate

negate(<value1> : number) => number

对数字求反。Negates a number. 将正数转换为负数,或反之。Turns positive numbers to negative and vice versa.

  • negate(13) -> -13

nextSequence

nextSequence() => long

返回下一个唯一序列。Returns the next unique sequence. 该数字仅在分区中是连续的,带有 partitionId 前缀。The number is consecutive only within a partition and is prefixed by the partitionId.

  • nextSequence() == 12313112 -> false

normalize

normalize(<String to normalize> : string) => string

规范化字符串值以分隔重音的 unicode 字符。Normalizes the string value to separate accented unicode characters.

  • regexReplace(normalize('bo²s'), `\p{M}`, '') -> 'boys'

not

not(<value1> : boolean) => boolean

逻辑求反运算符。Logical negation operator.

  • not(true) -> false
  • not(10 == 20) -> true

notEquals

notEquals(<value1> : any, <value2> : any) => boolean

“不等于”比较运算符。Comparison not equals operator. 与 != 运算符相同。Same as != operator.

  • 12 != 24 -> true
  • 'bojjus' != 'bo' + 'jjus' -> false

notNull

notNull(<value1> : any) => boolean

检查值是否不为 NULL。Checks if the value is not NULL.

  • notNull(NULL()) -> false
  • notNull('') -> true

null

null() => null

返回 NULL 值。Returns a NULL value. 如果存在名为“null”的列,则使用函数 syntax(null())Use the function syntax(null()) if there is a column named 'null'. 使用该函数的任何操作将导致 NULL。Any operation that uses will result in a NULL.

  • isNull('dumbo' + null) -> true
  • isNull(10 * null) -> true
  • isNull('') -> false
  • isNull(10 + 20) -> false
  • isNull(10/0) -> true

or

or(<value1> : boolean, <value2> : boolean) => boolean

“逻辑或”运算符。Logical OR operator. 与 || 相同。Same as ||.

  • or(true, false) -> true
  • true || false -> true

pMod

pMod(<value1> : any, <value2> : any) => any

将数字对正数取模。Positive Modulus of pair of numbers.

  • pmod(-20, 8) -> 4

partitionId

partitionId() => integer

返回输入行所在的当前分区 ID。Returns the current partition id the input row is in.

  • partitionId()

power

power(<value1> : number, <value2> : number) => double

以一个数为底、另一数为幂求值。Raises one number to the power of another.

  • power(10, 2) -> 100

random

random(<value1> : integral) => long

返回给定分区内的可选种子的随机数。Returns a random number given an optional seed within a partition. 种子应为固定值,与 partitionId 一起用于生成随机值 * random(1) == 1 -> false


regexExtract

regexExtract(<string> : string, <regex to find> : string, [<match group 1-based index> : integral]) => stringThe seed should be a fixed value and is used in conjunction with the partitionId to produce random values * random(1) == 1 -> false


regexExtract

regexExtract(<string> : string, <regex to find> : string, [<match group 1-based index> : integral]) => string

提取给定正则表达式模式的匹配子字符串。Extract a matching substring for a given regex pattern. 最后一个参数标识匹配组,如果省略,则默认为 1。The last parameter identifies the match group and is defaulted to 1 if omitted. 使用 <regex>(反引号)匹配字符串且不转义。Use <regex>(back quote) to match a string without escaping.

  • regexExtract('Cost is between 600 and 800 dollars', '(\\d+) and (\\d+)', 2) -> '800'
  • regexExtract('Cost is between 600 and 800 dollars', `(\d+) and (\d+)`, 2) -> '800'

regexMatch

regexMatch(<string> : string, <regex to match> : string) => boolean

检查字符串是否与给定的正则表达式模式相匹配。Checks if the string matches the given regex pattern. 使用 <regex>(反引号)匹配字符串且不转义。Use <regex>(back quote) to match a string without escaping.

  • regexMatch('200.50', '(\\d+).(\\d+)') -> true
  • regexMatch('200.50', `(\d+).(\d+)`) -> true

regexReplace

regexReplace(<string> : string, <regex to find> : string, <substring to replace> : string) => string

将给定字符串中出现的一个正则表达式模式全部替换为另一个子字符串。使用 <regex>(反引号)匹配字符串且不转义。Replace all occurrences of a regex pattern with another substring in the given string Use <regex>(back quote) to match a string without escaping.

  • regexReplace('100 and 200', '(\\d+)', 'bojjus') -> 'bojjus and bojjus'
  • regexReplace('100 and 200', `(\d+)`, 'gunchus') -> 'gunchus and gunchus'

regexSplit

regexSplit(<string to split> : string, <regex expression> : string) => array

基于分隔符和正则表达式拆分字符串,并返回字符串数组。Splits a string based on a delimiter based on regex and returns an array of strings.

  • regexSplit('bojjusAgunchusBdumbo', `[CAB]`) -> ['bojjus', 'gunchus', 'dumbo']
  • regexSplit('bojjusAgunchusBdumboC', `[CAB]`) -> ['bojjus', 'gunchus', 'dumbo', '']
  • (regexSplit('bojjusAgunchusBdumboC', `[CAB]`)[1]) -> 'bojjus'
  • isNull(regexSplit('bojjusAgunchusBdumboC', `[CAB]`)[20]) -> true

replace

replace(<string> : string, <substring to find> : string, [<substring to replace> : string]) => string

将给定字符串中出现的一个子字符串全部替换为另一个子字符串。Replace all occurrences of a substring with another substring in the given string. 如果省略最后一个参数,则默认为空字符串。If the last parameter is omitted, it is default to empty string.

  • replace('doggie dog', 'dog', 'cat') -> 'catgie cat'
  • replace('doggie dog', 'dog', '') -> 'gie '
  • replace('doggie dog', 'dog') -> 'gie '

reverse

reverse(<value1> : string) => string

反转字符串。Reverses a string.

  • reverse('gunchus') -> 'suhcnug'

right(<string to subset> : string, <number of characters> : integral) => string

从右侧提取包含字符数的子字符串。Extracts a substring with number of characters from the right. 与 SUBSTRING(str, LENGTH(str) - n, n) 相同。Same as SUBSTRING(str, LENGTH(str) - n, n).

  • right('bojjus', 2) -> 'us'
  • right('bojjus', 20) -> 'bojjus'

rlike

rlike(<string> : string, <pattern match> : string) => boolean

检查字符串是否与给定的正则表达式模式相匹配。Checks if the string matches the given regex pattern.

  • rlike('200.50', `(\d+).(\d+)`) -> true
  • rlike('bogus', `M[0-9]+.*`) -> false

round

round(<number> : number, [<scale to round> : number], [<rounding option> : integral]) => double

根据可选的小数位数和可选的舍入模式将数字四舍五入。Rounds a number given an optional scale and an optional rounding mode. 如果省略小数位数,则默认为 0。If the scale is omitted, it is defaulted to 0. 如果省略模式,则默认为 ROUND_HALF_UP(5)。If the mode is omitted, it is defaulted to ROUND_HALF_UP(5). 舍入运算的值包括:1 - ROUND_UP;2 - ROUND_DOWN;3 - ROUND_CEILING;4 - ROUND_FLOOR;5 - ROUND_HALF_UP;6 - ROUND_HALF_DOWN;7 - ROUND_HALF_EVEN;8 - ROUND_UNNECESSARY。The values for rounding include 1 - ROUND_UP 2 - ROUND_DOWN 3 - ROUND_CEILING 4 - ROUND_FLOOR 5 - ROUND_HALF_UP 6 - ROUND_HALF_DOWN 7 - ROUND_HALF_EVEN 8 - ROUND_UNNECESSARY.

  • round(100.123) -> 100.0
  • round(2.5, 0) -> 3.0
  • round(5.3999999999999995, 2, 7) -> 5.40

rpad

rpad(<string to pad> : string, <final padded length> : integral, <padding> : string) => string

在右侧使用提供的填充内容填充字符串,直到达到特定的长度。Right pads the string by the supplied padding until it is of a certain length. 如果字符串等于或大于特定长度,则进行剪裁。If the string is equal to or greater than the length, then it is trimmed to the length.

  • rpad('dumbo', 10, '-') -> 'dumbo-----'
  • rpad('dumbo', 4, '-') -> 'dumb'
  • rpad('dumbo', 8, '<>') -> 'dumbo<><'
    ___

rtrimrtrim

rtrim(<string to trim> : string, [<trim characters> : string]) => string___

rtrimrtrim

rtrim(<string to trim> : string, [<trim characters> : string]) => string

在右侧裁剪掉尾随字符的字符串。Right trims a string of trailing characters. 如果未指定第二个参数,则裁剪掉空格。If second parameter is unspecified, it trims whitespace. 否则,剪裁掉第二个参数中指定的任何字符。Else it trims any character specified in the second parameter.

  • rtrim(' dumbo ') -> ' dumbo'
  • rtrim('!--!du!mbo!', '-!') -> '!--!du!mbo'

second

second(<value1> : timestamp, [<value2> : string]) => integer

获取日期的秒值。Gets the second value of a date. 可以“GMT”、“PST”、“UTC”、“America/Cayman”格式传递可选的时区。You can pass an optional timezone in the form of 'GMT', 'PST', 'UTC', 'America/Cayman'. 当地时区用作默认值。The local timezone is used as the default. 如需查看可用的格式,请参阅 Java 的 SimpleDateFormat 类。Refer Java's SimpleDateFormat class for available formats. https://docs.oracle.com/javase/8/docs/api/java/text/SimpleDateFormat.html.https://docs.oracle.com/javase/8/docs/api/java/text/SimpleDateFormat.html.

  • second(toTimestamp('2009-07-30 12:58:59')) -> 59

seconds

seconds(<value1> : integer) => long

秒数的持续时间(以毫秒为单位)。Duration in milliseconds for number of seconds.

  • seconds(2) -> 2000L

sha1

sha1(<value1> : any, ...) => string

计算不同基元数据类型的列集的 SHA-1 摘要,并返回 40 字符十六进制字符串。Calculates the SHA-1 digest of set of column of varying primitive datatypes and returns a 40 character hex string. 它可用于计算行的指纹。It can be used to calculate a fingerprint for a row.

  • sha1(5, 'gunchus', 8.2, 'bojjus', true, toDate('2010-4-4')) -> '46d3b478e8ec4e1f3b453ac3d8e59d5854e282bb'

sha2

sha2(<value1> : integer, <value2> : any, ...) => string

根据位长度计算不同基元数据类型的列集的 SHA-2 摘要,值只能为 0(256)、224、256、384、512。Calculates the SHA-2 digest of set of column of varying primitive datatypes given a bit length which can only be of values 0(256), 224, 256, 384, 512. 它可用于计算行的指纹。It can be used to calculate a fingerprint for a row.

  • sha2(256, 'gunchus', 8.2, 'bojjus', true, toDate('2010-4-4')) -> 'afe8a553b1761c67d76f8c31ceef7f71b66a1ee6f4e6d3b5478bf68b47d06bd3'

sin

sin(<value1> : number) => double

计算正弦值。Calculates a sine value.

  • sin(2) -> 0.9092974268256817

sinh

sinh(<value1> : number) => double

计算双曲正弦值。Calculates a hyperbolic sine value.

  • sinh(0) -> 0.0

soundex

soundex(<value1> : string) => string

获取字符串的 soundex 代码。Gets the soundex code for the string.

  • soundex('genius') -> 'G520'

split

split(<string to split> : string, <split characters> : string) => array

基于分隔符拆分字符串,并返回字符串数组。Splits a string based on a delimiter and returns an array of strings.

  • split('bojjus,guchus,dumbo', ',') -> ['bojjus', 'guchus', 'dumbo']
  • split('bojjus,guchus,dumbo', '|') -> ['bojjus,guchus,dumbo']
  • split('bojjus, guchus, dumbo', ', ') -> ['bojjus', 'guchus', 'dumbo']
  • split('bojjus, guchus, dumbo', ', ')[1] -> 'bojjus'
  • isNull(split('bojjus, guchus, dumbo', ', ')[0]) -> true
  • isNull(split('bojjus, guchus, dumbo', ', ')[20]) -> true
  • split('bojjusguchusdumbo', ',') -> ['bojjusguchusdumbo']

sqrt

sqrt(<value1> : number) => double

计算一个数的平方根。Calculates the square root of a number.

  • sqrt(9) -> 3

startsWith

startsWith(<string> : string, <substring to check> : string) => boolean

检查字符串是否以提供的字符串开头。Checks if the string starts with the supplied string.

  • startsWith('dumbo', 'du') -> true

subDays

subDays(<date/timestamp> : datetime, <days to subtract> : integral) => datetime

从日期或时间戳中减去天数。Subtract days from a date or timestamp. 与日期的 - 运算符相同。Same as the - operator for date.

  • subDays(toDate('2016-08-08'), 1) -> toDate('2016-08-07')

subMonths

subMonths(<date/timestamp> : datetime, <months to subtract> : integral) => datetime

从日期或时间戳中减去月份。Subtract months from a date or timestamp.

  • subMonths(toDate('2016-09-30'), 1) -> toDate('2016-08-31')

substring

substring(<string to subset> : string, <from 1-based index> : integral, [<number of characters> : integral]) => string

从某个位置提取特定长度的子字符串。Extracts a substring of a certain length from a position. 位置从 1 开始。Position is 1 based. 如果省略长度,则默认为字符串的末尾。If the length is omitted, it is defaulted to end of the string.

  • substring('Cat in the hat', 5, 2) -> 'in'
  • substring('Cat in the hat', 5, 100) -> 'in the hat'
  • substring('Cat in the hat', 5) -> 'in the hat'
  • substring('Cat in the hat', 100, 100) -> ''

tan

tan(<value1> : number) => double

计算正切值。Calculates a tangent value.

  • tan(0) -> 0.0

tanh

tanh(<value1> : number) => double

计算双曲正切值。Calculates a hyperbolic tangent value.

  • tanh(0) -> 0.0

translate

translate(<string to translate> : string, <lookup characters> : string, <replace characters> : string) => string

将字符串中的一组字符替换为另一组字符。Replace one set of characters by another set of characters in the string. 对字符进行 1 对 1 的替换.Characters have 1 to 1 replacement.

  • translate('(bojjus)', '()', '[]') -> '[bojjus]'
  • translate('(gunchus)', '()', '[') -> '[gunchus'

trim

trim(<string to trim> : string, [<trim characters> : string]) => string

裁剪掉前导和尾随字符的字符串。Trims a string of leading and trailing characters. 如果未指定第二个参数,则裁剪掉空格。If second parameter is unspecified, it trims whitespace. 否则,剪裁掉第二个参数中指定的任何字符。Else it trims any character specified in the second parameter.

  • trim(' dumbo ') -> 'dumbo'
  • trim('!--!du!mbo!', '-!') -> 'du!mbo'

true

true() => boolean

始终返回 true 值。Always returns a true value. 如果存在名为“true”的列,则使用函数 syntax(true())Use the function syntax(true()) if there is a column named 'true'.

  • (10 + 20 == 30) -> true
  • (10 + 20 == 30) -> true()

typeMatch

typeMatch(<type> : string, <base type> : string) => boolean

匹配列的类型。Matches the type of the column. 只可在模式表达式中使用。number 匹配短整型数、整数、长整型数、双精度数、浮点数或小数,integral 匹配短整型数、整数、长整型数,fractional 匹配双精度数、浮点数、小数,datetime 匹配日期或时间戳类型。Can only be used in pattern expressions.number matches short, integer, long, double, float or decimal, integral matches short, integer, long, fractional matches double, float, decimal and datetime matches date or timestamp type.

  • typeMatch(type, 'number')
  • typeMatch('date', 'datetime')

unescape

unescape(<string_to_escape> : string, <format> : string) => string

根据格式取消转义字符串。Unescapes a string according to a format. 可接受的格式的文本值为 "json"、"xml"、"ecmascript"、"html"、"java"。Literal values for acceptable format are 'json', 'xml', 'ecmascript', 'html', 'java'.

  • unescape('{\\\\\"value\\\\\": 10}', 'json')
  • '{\\\"value\\\": 10}'

upper

upper(<value1> : string) => string

将字符串大写。Uppercases a string.

  • upper('bojjus') -> 'BOJJUS'

uuid

uuid() => string

返回生成的 UUID。Returns the generated UUID.

  • uuid()

weekOfYear

weekOfYear(<value1> : datetime) => integer

获取给定日期的年份周次。Gets the week of the year given a date.

  • weekOfYear(toDate('2008-02-20')) -> 8

weeks

weeks(<value1> : integer) => long

周数的持续时间(以毫秒为单位)。Duration in milliseconds for number of weeks.

  • weeks(2) -> 1209600000L

xor

xor(<value1> : boolean, <value2> : boolean) => boolean

逻辑 XOR 运算符。Logical XOR operator. 与 ^ 运算符相同。Same as ^ operator.

  • xor(true, false) -> true
  • xor(true, true) -> false
  • true ^ false -> true

year

year(<value1> : datetime) => integer

获取日期的年份值。Gets the year value of a date.

  • year(toDate('2012-8-8')) -> 2012

聚合函数Aggregate functions

以下函数仅可用于聚合、透视、逆透视和窗口转换。The following functions are only available in aggregate, pivot, unpivot, and window transformations.


avg

avg(<value1> : number) => number

获取列值的平均值。Gets the average of values of a column.

  • avg(sales)

avgIf

avgIf(<value1> : boolean, <value2> : number) => number

根据条件获取列值的平均值。Based on a criteria gets the average of values of a column.

  • avgIf(region == 'West', sales)

collect

collect(<value1> : any) => array

将聚合组中表达式的所有值收集到一个数组中。Collects all values of the expression in the aggregated group into an array. 在此过程中,可以收集结构并将其转换为备用结构。Structures can be collected and transformed to alternate structures during this process. 项目数将等于该组中的行数,并且可以包含 NULL 值。The number of items will be equal to the number of rows in that group and can contain null values. 收集的项目数应较小。The number of collected items should be small.

  • collect(salesPerson)
  • collect(firstName + lastName))
  • collect(@(name = salesPerson, sales = salesAmount) )

count

count([<value1> : any]) => long

获取值的聚合计数。Gets the aggregate count of values. 如果指定了可选的列,则忽略计数中的 NULL 值。If the optional column(s) is specified, it ignores NULL values in the count.

  • count(custId)
  • count(custId, custName)
  • count()
  • count(iif(isNull(custId), 1, NULL))

countDistinct

countDistinct(<value1> : any, [<value2> : any], ...) => long

获取列集的非重复值的聚合计数。Gets the aggregate count of distinct values of a set of columns.

  • countDistinct(custId, custName)

countIf

countIf(<value1> : boolean, [<value2> : any]) => long

根据条件获取值的聚合计数。Based on a criteria gets the aggregate count of values. 如果指定了可选的列,则忽略计数中的 NULL 值。If the optional column is specified, it ignores NULL values in the count.

  • countIf(state == 'CA' && commission < 10000, name)

covariancePopulation

covariancePopulation(<value1> : number, <value2> : number) => double

获取两个列之间的总体协方差。Gets the population covariance between two columns.

  • covariancePopulation(sales, profit)

covariancePopulationIf

covariancePopulationIf(<value1> : boolean, <value2> : number, <value3> : number) => double

根据条件获取两个列的总体协方差。Based on a criteria, gets the population covariance of two columns.

  • covariancePopulationIf(region == 'West', sales)

covarianceSample

covarianceSample(<value1> : number, <value2> : number) => double

获取两个列的样本协方差。Gets the sample covariance of two columns.

  • covarianceSample(sales, profit)

covarianceSampleIf

covarianceSampleIf(<value1> : boolean, <value2> : number, <value3> : number) => double

根据条件获取两个列的样本协方差。Based on a criteria, gets the sample covariance of two columns.

  • covarianceSampleIf(region == 'West', sales, profit)

first

first(<value1> : any, [<value2> : boolean]) => any

获取列组的第一个值。Gets the first value of a column group. 如果省略第二个参数 ignoreNulls,则假定为 false。If the second parameter ignoreNulls is omitted, it is assumed false.

  • first(sales)
  • first(sales, false)

isDistinct

isDistinct(<value1> : any , <value1> : any) => boolean

确定一列或一组列是否是非重复值。Finds if a column or set of columns is distinct. 它不将 null 计为非重复值 * isDistinct(custId, custName) => boolean * ___

kurtosis

kurtosis(<value1> : number) => doubleIt does not count null as a distinct value * isDistinct(custId, custName) => boolean * ___

kurtosis

kurtosis(<value1> : number) => double

获取列的峰度。Gets the kurtosis of a column.

  • kurtosis(sales)

kurtosisIf

kurtosisIf(<value1> : boolean, <value2> : number) => double

根据条件获取列的峰度。Based on a criteria, gets the kurtosis of a column.

  • kurtosisIf(region == 'West', sales)

last

last(<value1> : any, [<value2> : boolean]) => any

获取列组的最后一个值。Gets the last value of a column group. 如果省略第二个参数 ignoreNulls,则假定为 false。If the second parameter ignoreNulls is omitted, it is assumed false.

  • last(sales)
  • last(sales, false)

max

max(<value1> : any) => any

获取列的最大值。Gets the maximum value of a column.

  • max(sales)

maxIf

maxIf(<value1> : boolean, <value2> : any) => any

根据条件获取列的最大值。Based on a criteria, gets the maximum value of a column.

  • maxIf(region == 'West', sales)

mean

mean(<value1> : number) => number

获取列值的中间值。Gets the mean of values of a column. 与 AVG 相同。Same as AVG.

  • mean(sales)

meanIf

meanIf(<value1> : boolean, <value2> : number) => number

根据条件获取列值的中间值。Based on a criteria gets the mean of values of a column. 与 avgIf 相同。Same as avgIf.

  • meanIf(region == 'West', sales)

min

min(<value1> : any) => any

获取列的最小值。Gets the minimum value of a column.

  • min(sales)

minIf

minIf(<value1> : boolean, <value2> : any) => any

根据条件获取列的最小值。Based on a criteria, gets the minimum value of a column.

  • minIf(region == 'West', sales)

skewness

skewness(<value1> : number) => double

获取列的偏度。Gets the skewness of a column.

  • skewness(sales)

skewnessIf

skewnessIf(<value1> : boolean, <value2> : number) => double

根据条件获取列的偏度。Based on a criteria, gets the skewness of a column.

  • skewnessIf(region == 'West', sales)

stddev

stddev(<value1> : number) => double

获取列的标准偏差。Gets the standard deviation of a column.

  • stdDev(sales)

stddevIf

stddevIf(<value1> : boolean, <value2> : number) => double

根据条件获取列的标准偏差。Based on a criteria, gets the standard deviation of a column.

  • stddevIf(region == 'West', sales)

stddevPopulation

stddevPopulation(<value1> : number) => double

获取列的总体标准偏差。Gets the population standard deviation of a column.

  • stddevPopulation(sales)

stddevPopulationIf

stddevPopulationIf(<value1> : boolean, <value2> : number) => double

根据条件获取列的总体标准偏差。Based on a criteria, gets the population standard deviation of a column.

  • stddevPopulationIf(region == 'West', sales)

stddevSample

stddevSample(<value1> : number) => double

获取列的样本标准偏差。Gets the sample standard deviation of a column.

  • stddevSample(sales)

stddevSampleIf

stddevSampleIf(<value1> : boolean, <value2> : number) => double

根据条件获取列的样本标准偏差。Based on a criteria, gets the sample standard deviation of a column.

  • stddevSampleIf(region == 'West', sales)

sum

sum(<value1> : number) => number

获取数字列的聚合总数。Gets the aggregate sum of a numeric column.

  • sum(col)

sumDistinct

sumDistinct(<value1> : number) => number

获取数字列的非重复值的聚合总数。Gets the aggregate sum of distinct values of a numeric column.

  • sumDistinct(col)

sumDistinctIf

sumDistinctIf(<value1> : boolean, <value2> : number) => number

根据条件获取数字列的聚合总数。Based on criteria gets the aggregate sum of a numeric column. 条件可以基于任何列。The condition can be based on any column.

  • sumDistinctIf(state == 'CA' && commission < 10000, sales)
  • sumDistinctIf(true, sales)

sumIf

sumIf(<value1> : boolean, <value2> : number) => number

根据条件获取数字列的聚合总数。Based on criteria gets the aggregate sum of a numeric column. 条件可以基于任何列。The condition can be based on any column.

  • sumIf(state == 'CA' && commission < 10000, sales)
  • sumIf(true, sales)

variance

variance(<value1> : number) => double

获取列的方差。Gets the variance of a column.

  • variance(sales)

varianceIf

varianceIf(<value1> : boolean, <value2> : number) => double

根据条件获取列的方差。Based on a criteria, gets the variance of a column.

  • varianceIf(region == 'West', sales)

variancePopulation

variancePopulation(<value1> : number) => double

获取列的总体方差。Gets the population variance of a column.

  • variancePopulation(sales)

variancePopulationIf

variancePopulationIf(<value1> : boolean, <value2> : number) => double

根据条件获取列的总体方差。Based on a criteria, gets the population variance of a column.

  • variancePopulationIf(region == 'West', sales)

varianceSample

varianceSample(<value1> : number) => double

获取列的无偏方差。Gets the unbiased variance of a column.

  • varianceSample(sales)

varianceSampleIf

varianceSampleIf(<value1> : boolean, <value2> : number) => double

根据条件获取列的无偏方差。Based on a criteria, gets the unbiased variance of a column.

  • varianceSampleIf(region == 'West', sales)

数组函数Array functions

数组函数对属于数组的数据结构执行转换。Array functions perform transformations on data structures that are arrays. 其中包括用于对数组元素和索引进行寻址的特殊关键字:These include special keywords to address array elements and indexes:

  • #acc 表示在减少数组时希望包含在单个输出中的值#acc represents a value that you wish to include in your single output when reducing an array
  • #index 表示当前的数组索引以及数组索引编号 #index2, #index3 ...#index represents the current array index, along with array index numbers #index2, #index3 ...
  • #item 表示数组中的当前元素值#item represents the current element value in the array

array

array([<value1> : any], ...) => array

创建项的数组。Creates an array of items. 所有项应属于同一类型。All items should be of the same type. 如果未指定任何项,则默认值为空字符串数组。If no items are specified, an empty string array is the default. 与 [] 创建运算符相同。Same as a [] creation operator.

  • array('Seattle', 'Washington')
  • ['Seattle', 'Washington']
  • ['Seattle', 'Washington'][1]
  • 'Washington'

contains

contains(<value1> : array, <value2> : unaryfunction) => boolean

如果所提供的数组中的任何元素在提供的谓词中计算结果为 true,则返回 true。Returns true if any element in the provided array evaluates as true in the provided predicate. Contains 需要引用谓词函数中的一个元素作为 #item。Contains expects a reference to one element in the predicate function as #item.

  • contains([1, 2, 3, 4], #item == 3) -> true
  • contains([1, 2, 3, 4], #item > 5) -> false

filter

filter(<value1> : array, <value2> : unaryfunction) => array

筛选出数组中不满足所提供谓词的元素。Filters elements out of the array that do not meet the provided predicate. Filter 需要引用谓词函数中的一个元素作为 #item。Filter expects a reference to one element in the predicate function as #item.

  • filter([1, 2, 3, 4], #item > 2) -> [3, 4]
  • filter(['a', 'b', 'c', 'd'], #item == 'a' || #item == 'b') -> ['a', 'b']

find

find(<value1> : array, <value2> : unaryfunction) => any

查找数组中与条件相匹配的第一项。Find the first item from an array that match the condition. 它采用筛选器函数,你可以在其中将数组中的项作为 #item 进行寻址。It takes a filter function where you can address the item in the array as #item. 对于深层嵌套的映射,可使用 #item_n(#item_1, #item_2...) 表示法引用父映射。For deeply nested maps you can refer to the parent maps using the #item_n(#item_1, #item_2...) notation.

  • find([10, 20, 30], #item > 10) -> 20
  • find(['azure', 'data', 'factory'], length(#item) > 4) -> 'azure'
  • find([ @( name = 'Daniel', types = [ @(mood = 'jovial', behavior = 'terrific'), @(mood = 'grumpy', behavior = 'bad') ] ), @( name = 'Mark', types = [ @(mood = 'happy', behavior = 'awesome'), @(mood = 'calm', behavior = 'reclusive') ] ) ], contains(#item.types, #item.mood=='happy') /*Filter out the happy kid*/ )
  • @( name = 'Mark', types = [ @(mood = 'happy', behavior = 'awesome'), @(mood = 'calm', behavior = 'reclusive') ] )

in

in(<array of items> : array, <item to find> : any) => boolean

检查某个项是否 is in the array.
* in([10, 20, 30], 10) -> true
* ``in(['good', 'kid'], 'bad') -> fa

Checks if an item is in the array.
* in([10, 20, 30], 10) -> true
* ``in(['good', 'kid'], 'bad') -> fa
lse``


map

map(<value1> : array, <value2> : unaryfunction) => any

使用提供的表达式将数组的每个元素映射到新元素。

Maps each element of the array to a new element using the provided expression.
Map 需要引用表达式函数中的一个元素作为 #item。nction as #item.
* map([1, 2, 3, 4], #item + 2) -> [3, 4, 5, 6]
* ``map(['a', 'b', 'c', 'd'], #item + '_processed') -> ['a_processed', 'b_processed', 'c_processed', 'd_processe(备注:原文网址为https://docs.microsoft.com/zh-cn/azure/data-factory/data-flow-expression-functions,tag 处理有误。)
Map expects a reference to one element in the expression function as #item.
* map([1, 2, 3, 4], #item + 2) -> [3, 4, 5, 6]
* ``map(['a', 'b', 'c', 'd'], #item + '_processed') -> ['a_processed', 'b_processed', 'c_processed', 'd_processe
d']``


mapIf

mapIf ( : array, : binaryfunction, : binaryFunction) => any

有条件地将数组映射到长度相同或更短的另一个数组。

Conditionally maps an array to another array of same or smaller length.
这些值可以是任何数据类型,包括 structTypes。The values can be of any datatype including structTypes. 它采用一个映射函数,在其中可以将数组中的项作为 #item 进行寻址,将当前索引作为 #index 进行寻址。It takes a mapping function where you can address the item in the array as #item and current index as #index. 对于深层嵌套的映射,可使用 t maps using the #item_[n](#item_1, #index_1...) notation. * mapIf([10, 20, 30], #item > 10, #item + 5) -> [25, 35] * ``mapIf(['icecream', 'cake', 'soda'], length(#item) > 4, upper(#item)) -> ['ICECREAM', 'C 表示法引用父映射。For deeply nested maps you can refer to the parent maps using the #item_[n](#item_1, #index_1...) notation. * mapIf([10, 20, 30], #item > 10, #item + 5) -> [25, 35] * ``mapIf(['icecream', 'cake', 'soda'], length(#item) > 4, upper(#item)) -> ['ICECREAM', 'CAKE']``


mapIndex

mapIndex(<value1> : array, <value2> : binaryfunction) => any

使用提供的表达式将数组的每个元素映射到新元素。

Maps each element of the array to a new element using the provided expression.
Map 需要引用表达式函数中的一个元素作为 #item,并需要引用元素索引作为 #index。index as #index.
* ``mapIndex([1, 2, 3, 4], #item + 2 + #index) -> [4, 6, 8,
Map expects a reference to one element in the expression function as #item and a reference to the element index as #index.
* ``mapIndex([1, 2, 3, 4], #item + 2 + #index) -> [4, 6, 8,
10]``


mapLoop

mapLoop( : integer, : unaryfunction) => any

从 1 到某个长度进行循环,以创建具有该长度的数组。

Loops through from 1 to length to create an array of that length.
它采用一个映射函数,在其中可以将数组中的索引作为 #index 进行寻址。It takes a mapping function where you can address the index in the array as #index. 对于深层嵌套的映射,可使用 #index_n(#index_1, #index_2...) notation. * ``mapLoop(3, #index * 10) -> [10, 20 引用父映射。For deeply nested maps you can refer to the parent maps using the #index_n(#index_1, #index_2...) notation. * ``mapLoop(3, #index * 10) -> [10, 20, 30]``


reduce

reduce(<value1> : array, <value2> : any, <value3> : binaryfunction, <value4> : unaryfunction) => any

累积数组中的元素。

Accumulates elements in an array.
Reduce 需要引用第一个表达式函数中的累加器和一个元素作为 #acc 和 #item,并且需要在第二个表达式函数中使用结果值作为 #result。ession function.
* ``toString(reduce(['1', '2', '3', '4'], '0', #acc + #item, #result)) -> '012
Reduce expects a reference to an accumulator and one element in the first expression function as #acc and #item and it expects the resulting value as #result to be used in the second expression function.
* ``toString(reduce(['1', '2', '3', '4'], '0', #acc + #item, #result)) -> '012
34'``


size

size(<value1> : any) => integer

查找数组或映射类型的大小 rray or map type
* size(['element1', 'element2']) -> 2 * ``size([1,2,3])

Finds the size of an array or map type
* size(['element1', 'element2']) -> 2 * ``size([1,2,3])
-> 3``


slice

slice(<array to slice> : array, <from 1-based index> : integral, [<number of items> : integral]) => array

从位置提取数组的子集。

Extracts a subset of an array from a position.
位置从 1 开始。Position is 1 based. 如果省略长度,则默认为字符串的末尾。d of the string.
* slice([10, 20, 30, 40], 1, 2) -> [10, 20]
* slice([10, 20, 30, 40], 2) -> [20, 30, 40]
* slice([10, 20, 30, 40], 2)[1] -> 20
* isNull(slice([10, 20, 30, 40], 2)[0]) -> true
* isNull(slice([10, 20, 30, 40], 2)[20]) -> true
* ``slice(['a', 'b', 'c', 'd'], 8) ->
If the length is omitted, it is defaulted to end of the string.
* slice([10, 20, 30, 40], 1, 2) -> [10, 20]
* slice([10, 20, 30, 40], 2) -> [20, 30, 40]
* slice([10, 20, 30, 40], 2)[1] -> 20
* isNull(slice([10, 20, 30, 40], 2)[0]) -> true
* isNull(slice([10, 20, 30, 40], 2)[20]) -> true
* ``slice(['a', 'b', 'c', 'd'], 8) ->
[]``


sort

sort(<value1> : array, <value2> : binaryfunction) => array

使用提供的谓词函数对数组进行排序。

Sorts the array using the provided predicate function.
Sort 需要引用表达式函数中的两个连续元素作为 #item1 和 #item2。tem1 and #item2.
* sort([4, 8, 2, 3], compare(#item1, #item2)) -> [2, 3, 4, 8]
* sort(['a3', 'b2', 'c1'], iif(right(#item1, 1) >= right(#item2, 1), 1, -1)) -> ['c1', 'b2', 'a3']* @( name = 'Mark', types = [ @(mood = 'happy', behavior = 'awesome'), @(mood = 'calm', behavior = 'reclusive') ] )


in

in(<array of items> : array, <item to find> : any) => boolean

Checks if an item is in the array.
* in([10, 20, 30], 10) -> true
* in(['good', 'kid'], 'bad') -> false


map

map(<value1> : array, <value2> : unaryfunction) => any

Maps each element of the array to a new element using the provided expression. Map expects a reference to one element in the expression function as #item.
* map([1, 2, 3, 4], #item + 2) -> [3, 4, 5, 6]
* map(['a', 'b', 'c', 'd'], #item + '_processed') -> ['a_processed', 'b_processed', 'c_processed', 'd_processed']


mapIf

mapIf ( : array, : binaryfunction, : binaryFunction) => any

Conditionally maps an array to another array of same or smaller length. The values can be of any datatype including structTypes. It takes a mapping function where you can address the item in the array as #item and current index as #index. For deeply nested maps you can refer to the parent maps using the #item_[n](#item_1, #index_1...) notation. * mapIf([10, 20, 30], #item > 10, #item + 5) -> [25, 35] * mapIf(['icecream', 'cake', 'soda'], length(#item) > 4, upper(#item)) -> ['ICECREAM', 'CAKE']


mapIndex

mapIndex(<value1> : array, <value2> : binaryfunction) => any

Maps each element of the array to a new element using the provided expression. Map expects a reference to one element in the expression function as #item and a reference to the element index as #index.
* mapIndex([1, 2, 3, 4], #item + 2 + #index) -> [4, 6, 8, 10]


mapLoop

mapLoop( : integer, : unaryfunction) => any

Loops through from 1 to length to create an array of that length. It takes a mapping function where you can address the index in the array as #index. For deeply nested maps you can refer to the parent maps using the #index_n(#index_1, #index_2...) notation. * mapLoop(3, #index * 10) -> [10, 20, 30]


reduce

reduce(<value1> : array, <value2> : any, <value3> : binaryfunction, <value4> : unaryfunction) => any

Accumulates elements in an array. Reduce expects a reference to an accumulator and one element in the first expression function as #acc and #item and it expects the resulting value as #result to be used in the second expression function.
* toString(reduce(['1', '2', '3', '4'], '0', #acc + #item, #result)) -> '01234'


size

size(<value1> : any) => integer

Finds the size of an array or map type
* size(['element1', 'element2']) -> 2 * size([1,2,3]) -> 3


slice

slice(<array to slice> : array, <from 1-based index> : integral, [<number of items> : integral]) => array

Extracts a subset of an array from a position. Position is 1 based. If the length is omitted, it is defaulted to end of the string.
* slice([10, 20, 30, 40], 1, 2) -> [10, 20]
* slice([10, 20, 30, 40], 2) -> [20, 30, 40]
* slice([10, 20, 30, 40], 2)[1] -> 20
* isNull(slice([10, 20, 30, 40], 2)[0]) -> true
* isNull(slice([10, 20, 30, 40], 2)[20]) -> true
* slice(['a', 'b', 'c', 'd'], 8) -> []


sort

sort(<value1> : array, <value2> : binaryfunction) => array

Sorts the array using the provided predicate function. Sort expects a reference to two consecutive elements in the expression function as #item1 and #item2.
* sort([4, 8, 2, 3], compare(#item1, #item2)) -> [2, 3, 4, 8]
* sort(['a3', 'b2', 'c1'], iif(right(#item1, 1) >= right(#item2, 1), 1, -1)) -> ['c1', 'b2', 'a3']Sort expects a reference to two consecutive elements in the expression function as #item1 and #item2.
* sort([4, 8, 2, 3], compare(#item1, #item2)) -> [2, 3, 4, 8]
* sort(['a3', 'b2', 'c1'], iif(right(#item1, 1) >= right(#item2, 1), 1, -1)) -> ['c1', 'b2', 'a3']* @( name = 'Mark', types = [ @(mood = 'happy', behavior = 'awesome'), @(mood = 'calm', behavior = 'reclusive') ] )


in

in(<array of items> : array, <item to find> : any) => boolean

Checks if an item is in the array.
* in([10, 20, 30], 10) -> true
* in(['good', 'kid'], 'bad') -> false


map

map(<value1> : array, <value2> : unaryfunction) => any

Maps each element of the array to a new element using the provided expression. Map expects a reference to one element in the expression function as #item.
* map([1, 2, 3, 4], #item + 2) -> [3, 4, 5, 6]
* map(['a', 'b', 'c', 'd'], #item + '_processed') -> ['a_processed', 'b_processed', 'c_processed', 'd_processed']


mapIf

mapIf ( : array, : binaryfunction, : binaryFunction) => any

Conditionally maps an array to another array of same or smaller length. The values can be of any datatype including structTypes. It takes a mapping function where you can address the item in the array as #item and current index as #index. For deeply nested maps you can refer to the parent maps using the #item_[n](#item_1, #index_1...) notation. * mapIf([10, 20, 30], #item > 10, #item + 5) -> [25, 35] * mapIf(['icecream', 'cake', 'soda'], length(#item) > 4, upper(#item)) -> ['ICECREAM', 'CAKE']


mapIndex

mapIndex(<value1> : array, <value2> : binaryfunction) => any

Maps each element of the array to a new element using the provided expression. Map expects a reference to one element in the expression function as #item and a reference to the element index as #index.
* mapIndex([1, 2, 3, 4], #item + 2 + #index) -> [4, 6, 8, 10]


mapLoop

mapLoop( : integer, : unaryfunction) => any

Loops through from 1 to length to create an array of that length. It takes a mapping function where you can address the index in the array as #index. For deeply nested maps you can refer to the parent maps using the #index_n(#index_1, #index_2...) notation. * mapLoop(3, #index * 10) -> [10, 20, 30]


reduce

reduce(<value1> : array, <value2> : any, <value3> : binaryfunction, <value4> : unaryfunction) => any

Accumulates elements in an array. Reduce expects a reference to an accumulator and one element in the first expression function as #acc and #item and it expects the resulting value as #result to be used in the second expression function.
* toString(reduce(['1', '2', '3', '4'], '0', #acc + #item, #result)) -> '01234'


size

size(<value1> : any) => integer

Finds the size of an array or map type
* size(['element1', 'element2']) -> 2 * size([1,2,3]) -> 3


slice

slice(<array to slice> : array, <from 1-based index> : integral, [<number of items> : integral]) => array

Extracts a subset of an array from a position. Position is 1 based. If the length is omitted, it is defaulted to end of the string.
* slice([10, 20, 30, 40], 1, 2) -> [10, 20]
* slice([10, 20, 30, 40], 2) -> [20, 30, 40]
* slice([10, 20, 30, 40], 2)[1] -> 20
* isNull(slice([10, 20, 30, 40], 2)[0]) -> true
* isNull(slice([10, 20, 30, 40], 2)[20]) -> true
* slice(['a', 'b', 'c', 'd'], 8) -> []


sort

sort(<value1> : array, <value2> : binaryfunction) => array

Sorts the array using the provided predicate function. Sort expects a reference to two consecutive elements in the expression function as #item1 and #item2.
* sort([4, 8, 2, 3], compare(#item1, #item2)) -> [2, 3, 4, 8]
* sort(['a3', 'b2', 'c1'], iif(right(#item1, 1) >= right(#item2, 1), 1, -1)) -> ['c1', 'b2', 'a3']

缓存的 lookup 函数Cached lookup functions

只有在包含缓存的接收器的情况下使用缓存的查找时,以下函数才可用。The following functions are only available when using a cached lookup when you've included a cached sink.


lookup

lookup(key, key2, ...) => complex[]

使用与缓存接收器中的键匹配的指定键查找缓存接收器中的第一行。Looks up the first row from the cached sink using the specified keys that match the keys from the cached sink.

  • cacheSink#lookup(movieId)

mlookup

mlookup(key, key2, ...) => complex[]

使用与缓存接收器中的键匹配的指定键查找缓存接收器中的所有匹配行。Looks up the all matching rows from the cached sink using the specified keys that match the keys from the cached sink.

  • cacheSink#mlookup(movieId)

output

output() => any

返回缓存接收器结果的第一行 * cacheSink#output()Returns the first row of the results of the cache sink * cacheSink#output()


outputs

output() => any

返回缓存接收器结果的整个输出行集 * cacheSink#outputs() ___Returns the entire output row set of the results of the cache sink * cacheSink#outputs() ___

转换函数Conversion functions

转换函数用于转换数据和数据类型测试Conversion functions are used to convert data and test for data types

isBitSet

isBitSet ( : array, :integer ) => boolean

检查是否在此位集中设置了位的位置 * isBitSet(toBitSet([10, 32, 98]), 10) => true


setBitSet

setBitSet ( : array, :array) => arrayChecks if a bit position is set in this bitset * isBitSet(toBitSet([10, 32, 98]), 10) => true


setBitSet

setBitSet ( : array, :array) => array

在此位集中设置位的位置 * setBitSet(toBitSet([10, 32]), [98]) => [4294968320L, 17179869184L]


isBoolean

isBoolean( : string) => booleanSets bit positions in this bitset * setBitSet(toBitSet([10, 32]), [98]) => [4294968320L, 17179869184L]


isBoolean

isBoolean( : string) => boolean

根据 toBoolean() * isBoolean('true') -> true * isBoolean('no') -> true * isBoolean('microsoft') -> false


isByte

isByte( : string) => boolean 规则检查 string 值是否是 boolean 值Checks if the string value is a boolean value according to the rules of toBoolean() * isBoolean('true') -> true * isBoolean('no') -> true * isBoolean('microsoft') -> false


isByte

isByte( : string) => boolean

根据 toByte() * isByte('123') -> true * isByte('chocolate') -> false


isDate

isDate ( : string, [: string]) => boolean 规则检查 string 值是否是给定可选格式的 byte 值Checks if the string value is a byte value given an optional format according to the rules of toByte() * isByte('123') -> true * isByte('chocolate') -> false


isDate

isDate ( : string, [: string]) => boolean

使用可选输入日期格式检查输入日期字符串是否为日期。Checks if the input date string is a date using an optional input date format. 有关可用格式,请参阅 Java 的 SimpleDateFormat。Refer Java's SimpleDateFormat for available formats. 如果省略输入日期格式,则默认格式为 yyyy-[M]M-[d]dIf the input date format is omitted, default format is yyyy-[M]M-[d]d. 接受的格式为 [ yyyy, yyyy-[M]M, yyyy-[M]M-[d]d, yyyy-[M]M-[d]dT* ] * isDate('2012-8-18') -> true * isDate('12/18--234234' -> 'MM/dd/yyyy') -> false


isShort

isShort ( : string, [: string]) => booleanAccepted formats are [ yyyy, yyyy-[M]M, yyyy-[M]M-[d]d, yyyy-[M]M-[d]dT* ] * isDate('2012-8-18') -> true * isDate('12/18--234234' -> 'MM/dd/yyyy') -> false


isShort

isShort ( : string, [: string]) => boolean

根据 toShort() * isShort('123') -> true * isShort('$123' -> '$###') -> true * isShort('microsoft') -> false


isInteger

isInteger ( : string, [: string]) => boolean 规则检查 string 值是否是给定可选格式的 short 值Checks of the string value is a short value given an optional format according to the rules of toShort() * isShort('123') -> true * isShort('$123' -> '$###') -> true * isShort('microsoft') -> false


isInteger

isInteger ( : string, [: string]) => boolean

根据 toInteger() * isInteger('123') -> true * isInteger('$123' -> '$###') -> true * isInteger('microsoft') -> false


isLong

isLong ( : string, [: string]) => boolean 规则检查 string 值是否是给定可选格式的 integer 值Checks of the string value is a integer value given an optional format according to the rules of toInteger() * isInteger('123') -> true * isInteger('$123' -> '$###') -> true * isInteger('microsoft') -> false


isLong

isLong ( : string, [: string]) => boolean

根据 toLong() * isLong('123') -> true * isLong('$123' -> '$###') -> true * isLong('gunchus') -> false


isFloat

isFloat ( : string, [: string]) => boolean 规则检查 string 值是否是给定可选格式的 long 值Checks of the string value is a long value given an optional format according to the rules of toLong() * isLong('123') -> true * isLong('$123' -> '$###') -> true * isLong('gunchus') -> false


isFloat

isFloat ( : string, [: string]) => boolean

根据 toFloat() * isFloat('123') -> true * isFloat('$123.45' -> '$###.00') -> true * isFloat('icecream') -> false


isDouble

isDouble ( : string, [: string]) => boolean 规则检查 string 值是否是给定可选格式的 float 值Checks of the string value is a float value given an optional format according to the rules of toFloat() * isFloat('123') -> true * isFloat('$123.45' -> '$###.00') -> true * isFloat('icecream') -> false


isDouble

isDouble ( : string, [: string]) => boolean

根据 toDouble() * isDouble('123') -> true * isDouble('$123.45' -> '$###.00') -> true * isDouble('icecream') -> false


isDecimal

isDecimal ( : string) => boolean 规则检查 string 值是否是给定可选格式的 double 值Checks of the string value is a double value given an optional format according to the rules of toDouble() * isDouble('123') -> true * isDouble('$123.45' -> '$###.00') -> true * isDouble('icecream') -> false


isDecimal

isDecimal ( : string) => boolean

根据 toDecimal() * isDecimal('123.45') -> true * isDecimal('12/12/2000') -> false


isTimestamp

isTimestamp ( : string, [: string]) => boolean 规则检查 string 值是否是给定可选格式的 decimal 值Checks of the string value is a decimal value given an optional format according to the rules of toDecimal() * isDecimal('123.45') -> true * isDecimal('12/12/2000') -> false


isTimestamp

isTimestamp ( : string, [: string]) => boolean

使用可选输入时间戳格式检查输入日期字符串是否为时间戳。Checks if the input date string is a timestamp using an optional input timestamp format. 有关可用格式,请参阅 Java 的 SimpleDateFormat。Refer to Java's SimpleDateFormat for available formats. 如果省略时间戳,则使用默认模式 yyyy-[M]M-[d]d hh:mm:ss[.f...]If the timestamp is omitted the default pattern yyyy-[M]M-[d]d hh:mm:ss[.f...] is used. 可以“GMT”、“PST”、“UTC”、“America/Cayman”格式传递可选的时区。You can pass an optional timezone in the form of 'GMT', 'PST', 'UTC', 'America/Cayman'. 时间戳最多支持毫秒准确度,值为 999。有关可用格式,请参阅 Java 的 SimpleDateFormat。Timestamp supports up to millisecond accuracy with value of 999 Refer to Java's SimpleDateFormat for available formats.

  • isTimestamp('2016-12-31 00:12:00') -> true
  • isTimestamp('2016-12-31T00:12:00' -> 'yyyy-MM-dd\\'T\\'HH:mm:ss' -> 'PST') -> true
  • isTimestamp('2012-8222.18') -> false

toBase64

toBase64(<value1> : string) => string

以 base64 编码给定字符串。Encodes the given string in base64.

  • toBase64('bojjus') -> 'Ym9qanVz'

toBinary

toBinary(<value1> : any) => binary

将任何数字/日期/时间戳/字符串转换为二进制表示形式。Converts any numeric/date/timestamp/string to binary representation.

  • toBinary(3) -> [0x11]

toBoolean

toBoolean(<value1> : string) => boolean

可将 ('t', 'true', 'y', 'yes', '1') 值转换为 true,将 ('f', 'false', 'n', 'no', '0') 值转换为 false,将其他任何值转换为 NULL。Converts a value of ('t', 'true', 'y', 'yes', '1') to true and ('f', 'false', 'n', 'no', '0') to false and NULL for any other value.

  • toBoolean('true') -> true
  • toBoolean('n') -> false
  • isNull(toBoolean('truthy')) -> true

toByte

toByte(<value> : any, [<format> : string], [<locale> : string]) => byte

将任何数字或字符串转换为字节值。Converts any numeric or string to a byte value. 可以使用可选的 Java 十进制格式进行转换。An optional Java decimal format can be used for the conversion.

  • toByte(123)
  • 123
  • toByte(0xFF)
  • -1
  • toByte('123')
  • 123

toDate

toDate(<string> : any, [<date format> : string]) => date

使用可选输入日期格式将输入日期字符串转换为日期。Converts input date string to date using an optional input date format. 如需查看可用的格式,请参阅 Java 的 SimpleDateFormat 类。Refer Java's SimpleDateFormat class for available formats. 如果省略输入日期格式,则默认格式为 yyyy-[M]M-[d]d。If the input date format is omitted, default format is yyyy-[M]M-[d]d. 接受的格式包括:[ yyyy、yyyy-[M]M、yyyy-[M]M-[d]d、yyyy-[M]M-[d]dT* ]。Accepted formats are :[ yyyy, yyyy-[M]M, yyyy-[M]M-[d]d, yyyy-[M]M-[d]dT* ].

  • toDate('2012-8-18') -> toDate('2012-08-18')
  • toDate('12/18/2012', 'MM/dd/yyyy') -> toDate('2012-12-18')

toDecimal

toDecimal(<value> : any, [<precision> : integral], [<scale> : integral], [<format> : string], [<locale> : string]) => decimal(10,0)

将任何数字或字符串转换为小数值。Converts any numeric or string to a decimal value. 如果未指定精度和小数位数,则默认为 (10,2)。可以使用可选的 Java 十进制格式进行转换。If precision and scale are not specified, it is defaulted to (10,2).An optional Java decimal format can be used for the conversion. 采用 BCP47 语言形式(如 en-US、de、zh-CN)的可选区域设置格式。An optional locale format in the form of BCP47 language like en-US, de, zh-CN.

  • toDecimal(123.45) -> 123.45
  • toDecimal('123.45', 8, 4) -> 123.4500
  • toDecimal('$123.45', 8, 4,'$###.00') -> 123.4500
  • toDecimal('Ç123,45', 10, 2, 'Ç###,##', 'de') -> 123.45

toDouble

toDouble(<value> : any, [<format> : string], [<locale> : string]) => double

将任何数字或字符串转换为双精度值。Converts any numeric or string to a double value. 可以使用可选的 Java 十进制格式进行转换。An optional Java decimal format can be used for the conversion. 采用 BCP47 语言形式(如 en-US、de、zh-CN)的可选区域设置格式。An optional locale format in the form of BCP47 language like en-US, de, zh-CN.

  • toDouble(123.45) -> 123.45
  • toDouble('123.45') -> 123.45
  • toDouble('$123.45', '$###.00') -> 123.45
  • toDouble('Ç123,45', 'Ç###,##', 'de') -> 123.45

toFloat

toFloat(<value> : any, [<format> : string], [<locale> : string]) => float

将任何数字或字符串转换为浮点值。Converts any numeric or string to a float value. 可以使用可选的 Java 十进制格式进行转换。An optional Java decimal format can be used for the conversion. 截断任何双精度数。Truncates any double.

  • toFloat(123.45) -> 123.45f
  • toFloat('123.45') -> 123.45f
  • toFloat('$123.45', '$###.00') -> 123.45f

toInteger

toInteger(<value> : any, [<format> : string], [<locale> : string]) => integer

将任何数字或字符串转换为整数值。Converts any numeric or string to an integer value. 可以使用可选的 Java 十进制格式进行转换。An optional Java decimal format can be used for the conversion. 截断任何长整型数、浮点数、双精度数。Truncates any long, float, double.

  • toInteger(123) -> 123
  • toInteger('123') -> 123
  • toInteger('$123', '$###') -> 123

toLong

toLong(<value> : any, [<format> : string], [<locale> : string]) => long

将任何数字或字符串转换为长值。Converts any numeric or string to a long value. 可以使用可选的 Java 十进制格式进行转换。An optional Java decimal format can be used for the conversion. 截断任何浮点数、双精度数。Truncates any float, double.

  • toLong(123) -> 123
  • toLong('123') -> 123
  • toLong('$123', '$###') -> 123

toShort

toShort(<value> : any, [<format> : string], [<locale> : string]) => short

将任何数字或字符串转换为短值。Converts any numeric or string to a short value. 可以使用可选的 Java 十进制格式进行转换。An optional Java decimal format can be used for the conversion. 截断任何整数、长整型数、浮点数、双精度数。Truncates any integer, long, float, double.

  • toShort(123) -> 123
  • toShort('123') -> 123
  • toShort('$123', '$###') -> 123

toString

toString(<value> : any, [<number format/date format> : string]) => string

将基元数据类型转换为字符串。Converts a primitive datatype to a string. 对于数字和日期,可以指定格式。For numbers and date a format can be specified. 如果未指定,则选择系统默认值。对数字使用 Java 十进制格式。If unspecified the system default is picked.Java decimal format is used for numbers. 有关所有可能的日期格式,请参阅 Java SimpleDateFormat;默认格式为 yyyy-MM-dd。Refer to Java SimpleDateFormat for all possible date formats; the default format is yyyy-MM-dd.

  • toString(10) -> '10'
  • toString('engineer') -> 'engineer'
  • toString(123456.789, '##,###.##') -> '123,456.79'
  • toString(123.78, '000000.000') -> '000123.780'
  • toString(12345, '##0.#####E0') -> '12.345E3'
  • toString(toDate('2018-12-31')) -> '2018-12-31'
  • isNull(toString(toDate('2018-12-31', 'MM/dd/yy'))) -> true
  • toString(4 == 20) -> 'false'

toTimestamp

toTimestamp(<string> : any, [<timestamp format> : string], [<time zone> : string]) => timestamp

根据可选的时间戳格式将字符串转换为时间戳。Converts a string to a timestamp given an optional timestamp format. 如果省略时间戳,则使用默认模式 yyyy-[M]M-[d]d hh:mm:ss[.f...]。If the timestamp is omitted the default pattern yyyy-[M]M-[d]d hh:mm:ss[.f...] is used. 可以“GMT”、“PST”、“UTC”、“America/Cayman”格式传递可选的时区。You can pass an optional timezone in the form of 'GMT', 'PST', 'UTC', 'America/Cayman'. 时间戳支持高达 999 毫秒的精度值。Timestamp supports up to millisecond accuracy with value of 999. 如需查看可用的格式,请参阅 Java 的 SimpleDateFormat 类。Refer Java's SimpleDateFormat class for available formats. https://docs.oracle.com/javase/8/docs/api/java/text/SimpleDateFormat.html.https://docs.oracle.com/javase/8/docs/api/java/text/SimpleDateFormat.html.

  • toTimestamp('2016-12-31 00:12:00') -> toTimestamp('2016-12-31 00:12:00')
  • toTimestamp('2016-12-31T00:12:00', 'yyyy-MM-dd\'T\'HH:mm:ss', 'PST') -> toTimestamp('2016-12-31 00:12:00')
  • toTimestamp('12/31/2016T00:12:00', 'MM/dd/yyyy\'T\'HH:mm:ss') -> toTimestamp('2016-12-31 00:12:00')
  • millisecond(toTimestamp('2019-02-03 05:19:28.871', 'yyyy-MM-dd HH:mm:ss.SSS')) -> 871

toUTC

toUTC(<value1> : timestamp, [<value2> : string]) => timestamp

将时间戳转换为 UTC。Converts the timestamp to UTC. 可以“GMT”、“PST”、“UTC”、“America/Cayman”格式传递可选的时区。You can pass an optional timezone in the form of 'GMT', 'PST', 'UTC', 'America/Cayman'. 默认为当前时区。It is defaulted to the current timezone. 如需查看可用的格式,请参阅 Java 的 SimpleDateFormat 类。Refer Java's SimpleDateFormat class for available formats. https://docs.oracle.com/javase/8/docs/api/java/text/SimpleDateFormat.html.https://docs.oracle.com/javase/8/docs/api/java/text/SimpleDateFormat.html.

  • toUTC(currentTimestamp()) == toTimestamp('2050-12-12 19:18:12') -> false
  • toUTC(currentTimestamp(), 'Asia/Seoul') != toTimestamp('2050-12-12 19:18:12') -> true

元函数Metafunctions

元函数主要用于处理数据流中的元数据Metafunctions primarily function on metadata in your data flow

byItem

byItem(<parent column> : any, <column name> : string) => any

在结构或结构数组中查找子项,如果有多个匹配项,则返回第一个匹配项。Find a sub item within a structure or array of structure If there are multiple matches, the first match is returned. 如果没有匹配项,则返回 NULL 值。If no match it returns a NULL value. 返回的值必须是由某种类型转换操作(?The returned value has to be type converted by one of the type conversion actions(? date、?date, ? string…)转换的类型。设计时已知的列名应该只按名称进行寻址。string ...). Column names known at design time should be addressed just by their name. 不支持计算输入,但可以使用参数替换 * byItem( byName('customer'), 'orderItems') ? (itemName as string, itemQty as integer) * ````

  • byItem( byItem( byName('customer'), 'orderItems'), 'itemName') ? string

byOrigin

byOrigin(<column name> : string, [<origin stream name> : string]) => anyComputed inputs are not supported but you can use parameter substitutions * byItem( byName('customer'), 'orderItems') ? (itemName as string, itemQty as integer) * ````

  • byItem( byItem( byName('customer'), 'orderItems'), 'itemName') ? string

byOrigin

byOrigin(<column name> : string, [<origin stream name> : string]) => any

在源流中按名称选择列值。Selects a column value by name in the origin stream. 第二个参数是源流名称。The second argument is the origin stream name. 如果有多个匹配项,则返回第一个匹配项。If there are multiple matches, the first match is returned. 如果没有匹配项,则返回 NULL 值。If no match it returns a NULL value. 返回的值必须是由某种类型转换函数(TO_DATE、TO_STRING...)转换的类型。设计时已知的列名应该只按名称进行寻址。The returned value has to be type converted by one of the type conversion functions(TO_DATE, TO_STRING ...). Column names known at design time should be addressed just by their name. 不支持计算输入,但可以使用参数替换。Computed inputs are not supported but you can use parameter substitutions.

  • toString(byOrigin('ancestor', 'ancestorStream'))

byOrigins

byOrigins(<column names> : array, [<origin stream name> : string]) => any

在流中按名称选择列的数组。Selects an array of columns by name in the stream. 第二个参数是其源流。The second argument is the stream where it originated from. 如果有多个匹配项,则返回第一个匹配项。If there are multiple matches, the first match is returned. 如果没有匹配项,则返回 NULL 值。If no match it returns a NULL value. 返回的值必须是由某种类型转换函数(TO_DATE、TO_STRING…)转换的类型设计时已知的列名应该只按名称进行寻址。The returned value has to be type converted by one of the type conversion functions(TO_DATE, TO_STRING ...) Column names known at design time should be addressed just by their name. 不支持计算输入,但可以使用参数替换。Computed inputs are not supported but you can use parameter substitutions.

  • toString(byOrigins(['ancestor1', 'ancestor2'], 'ancestorStream'))

byName

byName(<column name> : string, [<stream name> : string]) => any

在流中按名称选择列值。Selects a column value by name in the stream. 可以将可选流名称作为第二个参数传递。You can pass a optional stream name as the second argument. 如果有多个匹配项,则返回第一个匹配项。If there are multiple matches, the first match is returned. 如果没有匹配项,则返回 NULL 值。If no match it returns a NULL value. 返回的值必须是由某种类型转换函数(TO_DATE、TO_STRING...)转换的类型。设计时已知的列名应该只按名称进行寻址。The returned value has to be type converted by one of the type conversion functions(TO_DATE, TO_STRING ...). Column names known at design time should be addressed just by their name. 不支持计算输入,但可以使用参数替换。Computed inputs are not supported but you can use parameter substitutions.

  • toString(byName('parent'))
  • toLong(byName('income'))
  • toBoolean(byName('foster'))
  • toLong(byName($debtCol))
  • toString(byName('Bogus Column'))
  • toString(byName('Bogus Column', 'DeriveStream'))

byNames

byNames(<column names> : array, [<stream name> : string]) => any

在流中按名称选择列的数组。Select an array of columns by name in the stream. 可以将可选流名称作为第二个参数传递。You can pass a optional stream name as the second argument. 如果有多个匹配项,则返回第一个匹配项。If there are multiple matches, the first match is returned. 如果列没有匹配项,则整个输出为 NULL 值。If there are no matches for a column, the entire output is a NULL value. 返回的值需要类型转换函数(toDate、toString...)。设计时已知的列名应该只按名称进行寻址。The returned value requires a type conversion functions (toDate, toString, ...). Column names known at design time should be addressed just by their name. 不支持计算输入,但可以使用参数替换。Computed inputs are not supported but you can use parameter substitutions.

  • toString(byNames(['parent', 'child']))
  • byNames(['parent']) ? string
  • toLong(byNames(['income']))
  • byNames(['income']) ? long
  • toBoolean(byNames(['foster']))
  • toLong(byNames($debtCols))
  • toString(byNames(['a Column']))
  • toString(byNames(['a Column'], 'DeriveStream'))
  • byNames(['orderItem']) ? (itemName as string, itemQty as integer)

byPath

byPath(<value1> : string, [<streamName> : string]) => any

在流中按名称查找分层路径。Finds a hierarchical path by name in the stream. 可以将可选流名称作为第二个参数传递。You can pass an optional stream name as the second argument. 如果找不到此类路径,则返回 null。If no such path is found it returns null. 设计时已知的列名/路径应该仅通过其名称或点表示法路径来寻址。Column names/paths known at design time should be addressed just by their name or dot notation path. 不支持计算输入,但可以使用参数替换。Computed inputs are not supported but you can use parameter substitutions.

  • byPath('grandpa.parent.child') => column

byPosition

byPosition(<position> : integer) => any

根据列在流中的相对位置(从 1 开始)选择列值。Selects a column value by its relative position(1 based) in the stream. 如果位置超出界限,则返回 NULL 值。If the position is out of bounds it returns a NULL value. 返回的值必须是由某种类型转换函数(TO_DATE、TO_STRING…)转换的类型不支持计算输入,但可以使用参数替换。The returned value has to be type converted by one of the type conversion functions(TO_DATE, TO_STRING ...) Computed inputs are not supported but you can use parameter substitutions.

  • toString(byPosition(1))
  • toDecimal(byPosition(2), 10, 2)
  • toBoolean(byName(4))
  • toString(byName($colName))
  • toString(byPosition(1234))

hasPath

hasPath(<value1> : string, [<streamName> : string]) => boolean

按名称检查流中是否存在某个分层路径。Checks if a certain hierarchical path exists by name in the stream. 可以将可选流名称作为第二个参数传递。You can pass an optional stream name as the second argument. 设计时已知的列名/路径应该仅通过其名称或点表示法路径来寻址。Column names/paths known at design time should be addressed just by their name or dot notation path. 不支持计算输入,但可以使用参数替换。Computed inputs are not supported but you can use parameter substitutions.

  • hasPath('grandpa.parent.child') => boolean

hex

hex(: binary) => string

返回二进制值的十六进制字符串表示形式 * hex(toBinary([toByte(0x1f), toByte(0xad), toByte(0xbe)])) -> '1fadbe'


unhex

unhex(: string) => binaryReturns a hex string representation of a binary value * hex(toBinary([toByte(0x1f), toByte(0xad), toByte(0xbe)])) -> '1fadbe'


unhex

unhex(: string) => binary

对字符串表示形式的二进制值进行反向十六进制转换。Unhexes a binary value from its string representation. 这可以用于联合 sha2 和 md5 从字符串表示形式转换为二进制表示形式 * unhex('1fadbe') -> toBinary([toByte(0x1f), toByte(0xad), toByte(0xbe)]) * unhex(md5(5, 'gunchus', 8.2, 'bojjus', true, toDate('2010-4-4'))) -> toBinary([toByte(0x4c),toByte(0xe8),toByte(0xa8),toByte(0x80),toByte(0xbd),toByte(0x62),toByte(0x1a),toByte(0x1f),toByte(0xfa),toByte(0xd0),toByte(0xbc),toByte(0xa9),toByte(0x05),toByte(0xe1),toByte(0xbc),toByte(0x5a)])This can be used in conjunction with sha2, md5 to convert from string to binary representation * unhex('1fadbe') -> toBinary([toByte(0x1f), toByte(0xad), toByte(0xbe)]) * unhex(md5(5, 'gunchus', 8.2, 'bojjus', true, toDate('2010-4-4'))) -> toBinary([toByte(0x4c),toByte(0xe8),toByte(0xa8),toByte(0x80),toByte(0xbd),toByte(0x62),toByte(0x1a),toByte(0x1f),toByte(0xfa),toByte(0xd0),toByte(0xbc),toByte(0xa9),toByte(0x05),toByte(0xe1),toByte(0xbc),toByte(0x5a)])

开窗函数Window functions

以下函数仅可用于窗口转换。The following functions are only available in window transformations.


cumeDist

cumeDist() => integer

CumeDist 函数计算某个值相对于分区中所有值的位置。The CumeDist function computes the position of a value relative to all values in the partition. 结果是前面的行数,或者等于当前行在分区中的顺序除以窗口分区中的总行数。The result is the number of rows preceding or equal to the current row in the ordering of the partition divided by the total number of rows in the window partition. 顺序中的任何关联值将计算为相同的位置。Any tie values in the ordering will evaluate to the same position.

  • cumeDist()

denseRank

denseRank() => integer

计算在窗口的 order by 子句中指定的一组值中的值排名。Computes the rank of a value in a group of values specified in a window's order by clause. 结果是 1 加上前面的行数,或者等于当前行在分区中的顺序。The result is one plus the number of rows preceding or equal to the current row in the ordering of the partition. 值不会在序列中生成空隙。The values will not produce gaps in the sequence. 即使数据未排序,也能进行密集排名,并且会查找值的变化。Dense Rank works even when data is not sorted and looks for change in values.

  • denseRank()

lag

lag(<value> : any, [<number of rows to look before> : number], [<default value> : any]) => any

获取当前行之前由第一个参数计算的 n 行的值。Gets the value of the first parameter evaluated n rows before the current row. 第二个参数是要反向查找的行数,默认值为 1。The second parameter is the number of rows to look back and the default value is 1. 如果行数不多,则返回 NULL 值,除非指定了默认值。If there are not as many rows a value of null is returned unless a default value is specified.

  • lag(amount, 2)
  • lag(amount, 2000, 100)

lead

lead(<value> : any, [<number of rows to look after> : number], [<default value> : any]) => any

获取当前行之后由第一个参数计算的 n 行的值。Gets the value of the first parameter evaluated n rows after the current row. 第二个参数是要正向查找的行数,默认值为 1。The second parameter is the number of rows to look forward and the default value is 1. 如果行数不多,则返回 NULL 值,除非指定了默认值。If there are not as many rows a value of null is returned unless a default value is specified.

  • lead(amount, 2)
  • lead(amount, 2000, 100)

nTile

nTile([<value1> : integer]) => integer

NTile 函数将每个窗口分区的行分割为从 1 到 nn 桶。The NTile function divides the rows for each window partition into n buckets ranging from 1 to at most n. 桶值最大相差 1。Bucket values will differ by at most 1. 如果分区中的行数不能均匀分割成桶数,则余值将逐个分布在每个桶中,从第一个桶开始。If the number of rows in the partition does not divide evenly into the number of buckets, then the remainder values are distributed one per bucket, starting with the first bucket. NTile 函数适合用于计算 tertiles、四分位数、十分位数和其他常见的摘要统计数据。The NTile function is useful for the calculation of tertiles, quartiles, deciles, and other common summary statistics. 在初始化期间,该函数将计算两个变量:常规桶的大小将额外添加一行。The function calculates two variables during initialization: The size of a regular bucket will have one extra row added to it. 这两个变量都以当前分区的大小为基础。Both variables are based on the size of the current partition. 在计算过程中,该函数将跟踪当前行号、当前桶号,以及发生桶更改的行号 (bucketThreshold)。During the calculation process the function keeps track of the current row number, the current bucket number, and the row number at which the bucket will change (bucketThreshold). 如果当前行号达到桶的阈值,则桶值将会加 1,阈值将按桶大小增加(如果当前桶已填充,则额外加 1)。When the current row number reaches bucket threshold, the bucket value is increased by one and the threshold is increased by the bucket size (plus one extra if the current bucket is padded).

  • nTile()
  • nTile(numOfBuckets)

rank

rank() => integer

计算在窗口的 order by 子句中指定的一组值中的值排名。Computes the rank of a value in a group of values specified in a window's order by clause. 结果是 1 加上前面的行数,或者等于当前行在分区中的顺序。The result is one plus the number of rows preceding or equal to the current row in the ordering of the partition. 值将在序列中生成空隙。The values will produce gaps in the sequence. 即使数据未排序,也能进行排名,并且会查找值的变化。Rank works even when data is not sorted and looks for change in values.

  • rank()

rowNumber

rowNumber() => integer

为窗口中的行分配一个顺序行号,从 1 开始。Assigns a sequential row numbering for rows in a window starting with 1.

  • rowNumber()

后续步骤Next steps

了解如何使用表达式生成器Learn how to use Expression Builder.