parse-where operator

Evaluates a string expression, and parses its value into one or more calculated columns. The result is only the successfully parsed strings.

See parse operator, which produces nulls for unsuccessfully parsed strings.

T | parse-where Text with "ActivityName=" name ", ActivityType=" type

Syntax

T | parse-where [kind=regex [flags=regex_flags] |simple] Expression with * (StringConstant ColumnName [: ColumnType]) *...

Arguments

  • T: The input table.

  • kind:

    • simple (default): StringConstant is a regular string value, and the match is strict. All string delimiters should appear in the parsed string, and all extended columns must match the required types.

    • regex: StringConstant may be a regular expression, and the match is strict. All string delimiters should appear in the parsed string, and all extended columns must match the required types. String delimiters can be a regex for this mode.

    • flags: Flags to be used in regex mode: U (Ungreedy), m (multi-line mode), s (match new line \n), i (case-insensitive), More flags can be found in RE2 flags.

  • Expression: An expression that evaluates to a string.

  • ColumnName: The name of a column that is assigned to a value that was taken out of the string expression.

  • ColumnType: should be an optional scalar type that indicates the type to convert the value to. The default is string type.

Returns

The input table, which is extended according to the list of columns that are provided to the operator.

Note

Only successfully parsed strings will be in the output. Strings that don't match the pattern will be filtered out.

Tips

  • parse-where parses the strings in the same way as parse, and filters out strings that were not parsed successfully.

  • Use project if you also want to drop or rename some columns.

  • Use * in the pattern to skip junk values. This value can't be used after string column.

  • The parse pattern may start with ColumnName, in addition to StringConstant.

  • If the parsed Expression isn't of type string, it will be converted to type string.

  • If regex mode is used, you can add regex flags to control the whole regex used in parse.

  • In regex mode, parse will translate the pattern to a regex and use RE2 syntax in order to do the matching using numbered captured groups that are handled internally.

    For example, this parse statement:

    parse-where kind=regex Col with * <regex1> var1:string <regex2> var2:long
    

    The regex that will be generated by the parse internally is .*?<regex1>(.*?)<regex2>(\-\d+).

    • * was translated to .*?.

    • string was translated to .*?.

    • long was translated to \-\d+.

Examples

The parse-where operator provides a streamlined way to extend a table by using multiple extract applications on the same string expression. This is most useful when the table has a string column that contains several values that you want to break into individual columns. For example, you can break up a column that was produced by a developer trace ("printf"/"Console.WriteLine") statement.

Using parse

In the example below, the column EventText of table Traces contains strings of the form Event: NotifySliceRelease (resourceName={0}, totalSlices= {1}, sliceNumber={2}, lockTime={3}, releaseTime={4}, previousLockTime={5}). The operation below will extend the table with six columns: resourceName , totalSlices, sliceNumber, lockTime , releaseTime, previousLockTime, Month, and Day.

A few of the strings don't have a full match.

Using parse, the calculated columns will have nulls.

let Traces = datatable(EventText:string)
[
"Event: NotifySliceRelease (resourceName=PipelineScheduler, totalSlices=27, sliceNumber=invalid_number, lockTime=02/17/2016 08:40:01, releaseTime=02/17/2016 08:40:01, previousLockTime=02/17/2016 08:39:01)",
"Event: NotifySliceRelease (resourceName=PipelineScheduler, totalSlices=27, sliceNumber=15, lockTime=02/17/2016 08:40:00, releaseTime=invalid_datetime, previousLockTime=02/17/2016 08:39:00)",
"Event: NotifySliceRelease (resourceName=PipelineScheduler, totalSlices=27, sliceNumber=20, lockTime=02/17/2016 08:40:01, releaseTime=02/17/2016 08:40:01, previousLockTime=02/17/2016 08:39:01)",
"Event: NotifySliceRelease (resourceName=PipelineScheduler, totalSlices=27, sliceNumber=22, lockTime=02/17/2016 08:41:01, releaseTime=02/17/2016 08:41:00, previousLockTime=02/17/2016 08:40:01)",
"Event: NotifySliceRelease (resourceName=PipelineScheduler, totalSlices=invalid_number, sliceNumber=16, lockTime=02/17/2016 08:41:00, releaseTime=02/17/2016 08:41:00, previousLockTime=02/17/2016 08:40:00)"
];
Traces  
| parse EventText with * "resourceName=" resourceName ", totalSlices=" totalSlices:long * "sliceNumber=" sliceNumber:long * "lockTime=" lockTime ", releaseTime=" releaseTime:date "," * "previousLockTime=" previouLockTime:date ")" *  
| project resourceName ,totalSlices , sliceNumber , lockTime , releaseTime , previousLockTime
resourceName totalSlices sliceNumber lockTime releaseTime previousLockTime
PipelineScheduler 27 20 02/17/2016 08:40:01 2016-02-17 08:40:01.0000000 2016-02-17 08:39:01.0000000
PipelineScheduler 27 22 02/17/2016 08:41:01 2016-02-17 08:41:00.0000000 2016-02-17 08:40:01.0000000

Using parse-where

Using 'parse-where' will filter-out unsuccessfully parsed strings from the result.

let Traces = datatable(EventText:string)
[
"Event: NotifySliceRelease (resourceName=PipelineScheduler, totalSlices=27, sliceNumber=invalid_number, lockTime=02/17/2016 08:40:01, releaseTime=02/17/2016 08:40:01, previousLockTime=02/17/2016 08:39:01)",
"Event: NotifySliceRelease (resourceName=PipelineScheduler, totalSlices=27, sliceNumber=15, lockTime=02/17/2016 08:40:00, releaseTime=invalid_datetime, previousLockTime=02/17/2016 08:39:00)",
"Event: NotifySliceRelease (resourceName=PipelineScheduler, totalSlices=27, sliceNumber=20, lockTime=02/17/2016 08:40:01, releaseTime=02/17/2016 08:40:01, previousLockTime=02/17/2016 08:39:01)",
"Event: NotifySliceRelease (resourceName=PipelineScheduler, totalSlices=27, sliceNumber=22, lockTime=02/17/2016 08:41:01, releaseTime=02/17/2016 08:41:00, previousLockTime=02/17/2016 08:40:01)",
"Event: NotifySliceRelease (resourceName=PipelineScheduler, totalSlices=invalid_number, sliceNumber=16, lockTime=02/17/2016 08:41:00, releaseTime=02/17/2016 08:41:00, previousLockTime=02/17/2016 08:40:00)"
];
Traces  
| parse-where EventText with * "resourceName=" resourceName ", totalSlices=" totalSlices:long * "sliceNumber=" sliceNumber:long * "lockTime=" lockTime ", releaseTime=" releaseTime:date "," * "previousLockTime=" previousLockTime:date ")" *  
| project resourceName ,totalSlices , sliceNumber , lockTime , releaseTime , previousLockTime
resourceName totalSlices sliceNumber lockTime releaseTime previousLockTime
PipelineScheduler 27 20 02/17/2016 08:40:01 2016-02-17 08:40:01.0000000 2016-02-17 08:39:01.0000000
PipelineScheduler 27 22 02/17/2016 08:41:01 2016-02-17 08:41:00.0000000 2016-02-17 08:40:01.0000000

Regex mode using regex flags

To get the resourceName and totalSlices, use the following query:

let Traces = datatable(EventText:string)
[
"Event: NotifySliceRelease (resourceName=PipelineScheduler, totalSlices=non_valid_integer, sliceNumber=11, lockTime=02/17/2016 08:40:01, releaseTime=02/17/2016 08:40:01, previousLockTime=02/17/2016 08:39:01)",
"Event: NotifySliceRelease (resourceName=PipelineScheduler, totalSlices=27, sliceNumber=15, lockTime=02/17/2016 08:40:00, releaseTime=02/17/2016 08:40:00, previousLockTime=02/17/2016 08:39:00)",
"Event: NotifySliceRelease (resourceName=PipelineScheduler, totalSlices=non_valid_integer, sliceNumber=44, lockTime=02/17/2016 08:40:01, releaseTime=02/17/2016 08:40:01, previousLockTime=02/17/2016 08:39:01)",
"Event: NotifySliceRelease (resourceName=PipelineScheduler, totalSlices=27, sliceNumber=22, lockTime=02/17/2016 08:41:01, releaseTime=02/17/2016 08:41:00, previousLockTime=02/17/2016 08:40:01)",
"Event: NotifySliceRelease (resourceName=PipelineScheduler, totalSlices=27, sliceNumber=16, lockTime=02/17/2016 08:41:00, releaseTime=02/17/2016 08:41:00, previousLockTime=02/17/2016 08:40:00)"
];
Traces
| parse-where kind = regex EventText with * "RESOURCENAME=" resourceName "," * "totalSlices=" totalSlices:long "," *
| project resourceName, totalSlices

parse-where with case-insensitive regex flag

In the above query, the default mode was case-sensitive, so the strings were parsed successfully. No result was obtained.

To get the required result, run parse-where with a case-insensitive (i) regex flag.

Only three strings will be parsed successfully, so the result is three records (some totalSlices hold invalid integers).

let Traces = datatable(EventText:string)
[
"Event: NotifySliceRelease (resourceName=PipelineScheduler, totalSlices=non_valid_integer, sliceNumber=11, lockTime=02/17/2016 08:40:01, releaseTime=02/17/2016 08:40:01, previousLockTime=02/17/2016 08:39:01)",
"Event: NotifySliceRelease (resourceName=PipelineScheduler, totalSlices=27, sliceNumber=15, lockTime=02/17/2016 08:40:00, releaseTime=02/17/2016 08:40:00, previousLockTime=02/17/2016 08:39:00)",
"Event: NotifySliceRelease (resourceName=PipelineScheduler, totalSlices=non_valid_integer, sliceNumber=44, lockTime=02/17/2016 08:40:01, releaseTime=02/17/2016 08:40:01, previousLockTime=02/17/2016 08:39:01)",
"Event: NotifySliceRelease (resourceName=PipelineScheduler, totalSlices=27, sliceNumber=22, lockTime=02/17/2016 08:41:01, releaseTime=02/17/2016 08:41:00, previousLockTime=02/17/2016 08:40:01)",
"Event: NotifySliceRelease (resourceName=PipelineScheduler, totalSlices=27, sliceNumber=16, lockTime=02/17/2016 08:41:00, releaseTime=02/17/2016 08:41:00, previousLockTime=02/17/2016 08:40:00)"
];
Traces
| parse-where kind = regex flags=i EventText with * "RESOURCENAME=" resourceName "," * "totalSlices=" totalSlices:long "," *
| project resourceName, totalSlices
resourceName totalSlices
PipelineScheduler 27
PipelineScheduler 27
PipelineScheduler 27