parse-where operator

Evaluates a string expression and parses its value into one or more calculated columns. The result contains only the successfully parsed strings.

See the parse operator, which produces nulls for unsuccessfully parsed strings.

T | parse-where Text with "ActivityName=" name ", ActivityType=" type

Syntax

T | parse-where [kind=regex [flags=regex_flags] |simple] Expression with * ( StringConstant ColumnName [: ColumnType ]) *...

Arguments

  • T : The input table.

  • kind :

    • simple (default): StringConstant is a regular string value, and the match is strict. All string delimiters must appear in the parsed string, and all extended columns must match the required types.

    • regex : StringConstant may be a regular expression, and the match is strict. All string delimiters must appear in the parsed string, and all extended columns must match the required types. In this mode, string delimiters can be regular expressions.

    • flags : Flags to be used in regex mode: U (ungreedy), m (multi-line mode), s (match new line \n), i (case-insensitive). More flags can be found in RE2 flags.

  • Expression : An expression that evaluates to a string.

  • ColumnName : The name of a column to assign to a value extracted from the string expression.

  • ColumnType : An optional scalar type that indicates the type to convert the value to. The default is string.
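
As a minimal sketch of how a typed column affects filtering in simple mode, the query below runs over a made-up inline table. The names Line, user, and attempts are hypothetical and don't come from the reference text above. Because attempts is declared as long, the row whose value isn't numeric should be dropped rather than extended with a null.

```kusto
// Hypothetical sample rows, invented for illustration only.
datatable(Line:string)
[
    "user=alice, attempts=3",
    "user=bob, attempts=many"
]
// simple mode (the default): both delimiters must appear in the string,
// and the text captured for attempts must convert to long.
| parse-where Line with "user=" user ", attempts=" attempts:long
```

Swapping parse-where for parse in the same query would be expected to keep both rows and leave the calculated columns null for the row that doesn't parse.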

Returns

The input table, extended according to the list of columns provided to the operator.

Note

Only successfully parsed strings appear in the output. Strings that don't match the pattern are filtered out.

Tips

  • parse-where parses strings in the same way as parse, and filters out strings that weren't parsed successfully.

  • Use project if you also want to drop or rename some columns.

  • Use * in the pattern to skip junk values. The * can't be used immediately after a string column.

  • The parse pattern may start with either a ColumnName or a StringConstant.

  • If the parsed Expression isn't of type string, it's converted to type string.

  • If regex mode is used, you can add regex flags to control the whole regex used in the parse.

  • In regex mode, parse translates the pattern to a regex and uses RE2 syntax to do the matching, with numbered capture groups that are handled internally.

    For example, this parse statement:

    parse-where kind=regex Col with * <regex1> var1:string <regex2> var2:long
    

    The regex that parse generates internally is .*?<regex1>(.*?)<regex2>(\-\d+).

    • * was translated to .*?.

    • string was translated to .*?.

    • long was translated to \-\d+.
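
To illustrate regex mode with regex delimiters and a leading *, here is a small sketch over made-up data. The Line, id, and name identifiers are hypothetical, and the delimiters are written as verbatim string literals (@"...") on the assumption that this keeps the backslashes intact for the regex engine.

```kusto
// Hypothetical sample rows, invented for illustration only.
datatable(Line:string)
[
    "prefix id = 42;   name = alpha; end",
    "prefix id = abc;  name = beta; end"
]
// The leading * skips the junk before "id". Each delimiter is a regex that
// tolerates optional whitespace around "=" and after ";".
| parse-where kind=regex Line with * @"id\s*=\s*" id:long @";\s*name\s*=\s*" name ";" *
| project id, name
```

Under the behavior described above, the first row should pass and the second should be filtered out, because abc can't be read as a long.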

Examples

The parse-where operator provides a streamlined way to extend a table by using multiple extract applications on the same string expression. This is most useful when the table has a string column that contains several values that you want to break into individual columns. For example, you can break up a column that was produced by a developer trace ("printf"/"Console.WriteLine") statement.

Using parse

In the example below, the column EventText of table Traces contains strings of the form Event: NotifySliceRelease (resourceName={0}, totalSlices={1}, sliceNumber={2}, lockTime={3}, releaseTime={4}, previousLockTime={5}). The operation below extends the table with six columns: resourceName, totalSlices, sliceNumber, lockTime, releaseTime, and previousLockTime.

A few of the strings don't have a full match.

With parse, the calculated columns for those strings contain nulls.

let Traces = datatable(EventText:string)
[
"Event: NotifySliceRelease (resourceName=PipelineScheduler, totalSlices=27, sliceNumber=invalid_number, lockTime=02/17/2016 08:40:01, releaseTime=02/17/2016 08:40:01, previousLockTime=02/17/2016 08:39:01)",
"Event: NotifySliceRelease (resourceName=PipelineScheduler, totalSlices=27, sliceNumber=15, lockTime=02/17/2016 08:40:00, releaseTime=invalid_datetime, previousLockTime=02/17/2016 08:39:00)",
"Event: NotifySliceRelease (resourceName=PipelineScheduler, totalSlices=27, sliceNumber=20, lockTime=02/17/2016 08:40:01, releaseTime=02/17/2016 08:40:01, previousLockTime=02/17/2016 08:39:01)",
"Event: NotifySliceRelease (resourceName=PipelineScheduler, totalSlices=27, sliceNumber=22, lockTime=02/17/2016 08:41:01, releaseTime=02/17/2016 08:41:00, previousLockTime=02/17/2016 08:40:01)",
"Event: NotifySliceRelease (resourceName=PipelineScheduler, totalSlices=invalid_number, sliceNumber=16, lockTime=02/17/2016 08:41:00, releaseTime=02/17/2016 08:41:00, previousLockTime=02/17/2016 08:40:00)"
];
Traces  
| parse EventText with * "resourceName=" resourceName ", totalSlices=" totalSlices:long * "sliceNumber=" sliceNumber:long * "lockTime=" lockTime ", releaseTime=" releaseTime:date "," * "previousLockTime=" previousLockTime:date ")" *
| project resourceName ,totalSlices , sliceNumber , lockTime , releaseTime , previousLockTime
The three strings that don't fully match yield rows whose calculated columns are null; the two fully parsed rows are:

resourceName       totalSlices  sliceNumber  lockTime             releaseTime                  previousLockTime
PipelineScheduler  27           20           02/17/2016 08:40:01  2016-02-17 08:40:01.0000000  2016-02-17 08:39:01.0000000
PipelineScheduler  27           22           02/17/2016 08:41:01  2016-02-17 08:41:00.0000000  2016-02-17 08:40:01.0000000

Using parse-where

Using parse-where filters unsuccessfully parsed strings out of the result.

let Traces = datatable(EventText:string)
[
"Event: NotifySliceRelease (resourceName=PipelineScheduler, totalSlices=27, sliceNumber=invalid_number, lockTime=02/17/2016 08:40:01, releaseTime=02/17/2016 08:40:01, previousLockTime=02/17/2016 08:39:01)",
"Event: NotifySliceRelease (resourceName=PipelineScheduler, totalSlices=27, sliceNumber=15, lockTime=02/17/2016 08:40:00, releaseTime=invalid_datetime, previousLockTime=02/17/2016 08:39:00)",
"Event: NotifySliceRelease (resourceName=PipelineScheduler, totalSlices=27, sliceNumber=20, lockTime=02/17/2016 08:40:01, releaseTime=02/17/2016 08:40:01, previousLockTime=02/17/2016 08:39:01)",
"Event: NotifySliceRelease (resourceName=PipelineScheduler, totalSlices=27, sliceNumber=22, lockTime=02/17/2016 08:41:01, releaseTime=02/17/2016 08:41:00, previousLockTime=02/17/2016 08:40:01)",
"Event: NotifySliceRelease (resourceName=PipelineScheduler, totalSlices=invalid_number, sliceNumber=16, lockTime=02/17/2016 08:41:00, releaseTime=02/17/2016 08:41:00, previousLockTime=02/17/2016 08:40:00)"
];
Traces  
| parse-where EventText with * "resourceName=" resourceName ", totalSlices=" totalSlices:long * "sliceNumber=" sliceNumber:long * "lockTime=" lockTime ", releaseTime=" releaseTime:date "," * "previousLockTime=" previousLockTime:date ")" *  
| project resourceName ,totalSlices , sliceNumber , lockTime , releaseTime , previousLockTime
resourceName       totalSlices  sliceNumber  lockTime             releaseTime                  previousLockTime
PipelineScheduler  27           20           02/17/2016 08:40:01  2016-02-17 08:40:01.0000000  2016-02-17 08:39:01.0000000
PipelineScheduler  27           22           02/17/2016 08:41:01  2016-02-17 08:41:00.0000000  2016-02-17 08:40:01.0000000

Regex mode using regex flags

To get the resourceName and totalSlices, use the following query:

let Traces = datatable(EventText:string)
[
"Event: NotifySliceRelease (resourceName=PipelineScheduler, totalSlices=non_valid_integer, sliceNumber=11, lockTime=02/17/2016 08:40:01, releaseTime=02/17/2016 08:40:01, previousLockTime=02/17/2016 08:39:01)",
"Event: NotifySliceRelease (resourceName=PipelineScheduler, totalSlices=27, sliceNumber=15, lockTime=02/17/2016 08:40:00, releaseTime=02/17/2016 08:40:00, previousLockTime=02/17/2016 08:39:00)",
"Event: NotifySliceRelease (resourceName=PipelineScheduler, totalSlices=non_valid_integer, sliceNumber=44, lockTime=02/17/2016 08:40:01, releaseTime=02/17/2016 08:40:01, previousLockTime=02/17/2016 08:39:01)",
"Event: NotifySliceRelease (resourceName=PipelineScheduler, totalSlices=27, sliceNumber=22, lockTime=02/17/2016 08:41:01, releaseTime=02/17/2016 08:41:00, previousLockTime=02/17/2016 08:40:01)",
"Event: NotifySliceRelease (resourceName=PipelineScheduler, totalSlices=27, sliceNumber=16, lockTime=02/17/2016 08:41:00, releaseTime=02/17/2016 08:41:00, previousLockTime=02/17/2016 08:40:00)"
];
Traces
| parse-where kind = regex EventText with * "RESOURCENAME=" resourceName "," * "totalSlices=" totalSlices:long "," *
| project resourceName, totalSlices

parse-where with case-insensitive regex flag

In the query above, matching is case-sensitive by default, so the uppercase delimiter "RESOURCENAME=" didn't match any string and no result was obtained.

To get the required result, run parse-where with the case-insensitive (i) regex flag.

Only three strings are parsed successfully, because the totalSlices values in the other two aren't valid integers, so the result has three records.

let Traces = datatable(EventText:string)
[
"Event: NotifySliceRelease (resourceName=PipelineScheduler, totalSlices=non_valid_integer, sliceNumber=11, lockTime=02/17/2016 08:40:01, releaseTime=02/17/2016 08:40:01, previousLockTime=02/17/2016 08:39:01)",
"Event: NotifySliceRelease (resourceName=PipelineScheduler, totalSlices=27, sliceNumber=15, lockTime=02/17/2016 08:40:00, releaseTime=02/17/2016 08:40:00, previousLockTime=02/17/2016 08:39:00)",
"Event: NotifySliceRelease (resourceName=PipelineScheduler, totalSlices=non_valid_integer, sliceNumber=44, lockTime=02/17/2016 08:40:01, releaseTime=02/17/2016 08:40:01, previousLockTime=02/17/2016 08:39:01)",
"Event: NotifySliceRelease (resourceName=PipelineScheduler, totalSlices=27, sliceNumber=22, lockTime=02/17/2016 08:41:01, releaseTime=02/17/2016 08:41:00, previousLockTime=02/17/2016 08:40:01)",
"Event: NotifySliceRelease (resourceName=PipelineScheduler, totalSlices=27, sliceNumber=16, lockTime=02/17/2016 08:41:00, releaseTime=02/17/2016 08:41:00, previousLockTime=02/17/2016 08:40:00)"
];
Traces
| parse-where kind = regex flags=i EventText with * "RESOURCENAME=" resourceName "," * "totalSlices=" totalSlices:long "," *
| project resourceName, totalSlices
resourceName       totalSlices
PipelineScheduler  27
PipelineScheduler  27
PipelineScheduler  27