parse 运算符
计算字符串表达式并将其值分析为一个或多个计算列。 已计算列将针对未成功分析的字符串返回 null
值。 如果不需要使用分析不成功的行,则优先使用 parse-where 运算符。
语法
T | parse
[ kind=
kind [ flags=
regexFlags ]] expression with
[ *
] stringConstant columnName [:
columnType] [ *
] ,
...
详细了解语法约定。
参数
客户 | 类型 | 必需 | 说明 |
---|---|---|---|
T | string |
✔️ | 要分析的表格输入。 |
kind | string |
✔️ | 支持的类型值之一。 默认值为 simple 。 |
regexFlags | string |
如果 kind 为 regex ,则可以指定要使用的正则表达式标志,例如 U 表示 ungreedy,m 表示多行模式,s 表示匹配新行 \n ,i 表示不区分大小写。 可以在标志中找到更多标志。 |
|
expression | string |
✔️ | 计算结果为字符串的表达式。 |
stringConstant | string |
✔️ | 要搜索和分析的字符串常量。 |
columnName | string |
✔️ | 要为其赋值的列的名称(从字符串表达式中提取)。 |
columnType | string |
一个标量值,指示要将值转换为何种类型。 默认为 string 。 |
注意
- 除了 StringConstant 之外,分析模式还可以以 ColumnName 开头。
- 在模式中使用
*
可跳过垃圾值。 不能在string
类型列后使用*
。 - 如果分析的表达式不是
string
类型,则会将其转换为string
类型。 - 如果还希望删除或重命名某些列,请使用
project
。
支持的类型值
文本 | 说明 |
---|---|
simple |
这是默认值。 stringConstant 是一个正则字符串值,匹配是严格的。 所有字符串分隔符都应出现在分析的字符串中,并且所有扩展列都必须与所需类型匹配。 |
regex |
stringConstant 可能是正则表达式,匹配要求较为严格。 所有字符串分隔符(对于此模式,可以是正则表达式)都应出现在分析的字符串中,并且所有扩展列都必须与所需类型匹配。 |
relaxed |
stringConstant 是一个正则字符串值,匹配是宽松的。 所有字符串分隔符都应出现在分析的字符串中,但是扩展列可能会部分匹配所需的类型。 与所需类型不匹配的扩展列将获得值 null 。 |
正则表达式模式
在正则表达式模式下,分析会将模式转换为正则表达式。 请使用正则表达式进行匹配,并使用在内部处理的带编号的已捕获组。 例如:
parse kind=regex Col with * <regex1> var1:string <regex2> var2:long
在分析语句中,分析内部生成的正则表达式是 .*?<regex1>(.*?)<regex2>(\-\d+)
。
*
转换为.*?
。string
转换为.*?
。long
转换为\-\d+
。
返回
输入表,根据提供给运算符的列的列表进行扩展。
示例
parse
运算符提供了一种简单的方法,可通过对同一 string
表达式使用多个 extract
应用程序来 extend
表。 当表中有一个 string
列,其中包含多个要分解为单独列的值时,此结果非常有用。 例如,由开发人员跟踪 ("printf
"/"Console.WriteLine
") 语句生成的列。
分析和扩展结果
在以下示例中,表 Traces
的列 EventText
包含 Event: NotifySliceRelease (resourceName={0}, totalSlices={1}, sliceNumber={2}, lockTime={3}, releaseTime={4}, previousLockTime={5})
形式的字符串。
该操作使用六列扩展表:resourceName
、totalSlices
、sliceNumber
、lockTime
、releaseTime
和 previousLockTime
。
let Traces = datatable(EventText: string)
[
"Event: NotifySliceRelease (resourceName=PipelineScheduler, totalSlices=27, sliceNumber=23, lockTime=02/17/2016 08:40:01, releaseTime=02/17/2016 08:40:01, previousLockTime=02/17/2016 08:39:01)",
"Event: NotifySliceRelease (resourceName=PipelineScheduler, totalSlices=27, sliceNumber=15, lockTime=02/17/2016 08:40:00, releaseTime=02/17/2016 08:40:00, previousLockTime=02/17/2016 08:39:00)",
"Event: NotifySliceRelease (resourceName=PipelineScheduler, totalSlices=27, sliceNumber=20, lockTime=02/17/2016 08:40:01, releaseTime=02/17/2016 08:40:01, previousLockTime=02/17/2016 08:39:01)",
"Event: NotifySliceRelease (resourceName=PipelineScheduler, totalSlices=27, sliceNumber=22, lockTime=02/17/2016 08:41:01, releaseTime=02/17/2016 08:41:00, previousLockTime=02/17/2016 08:40:01)",
"Event: NotifySliceRelease (resourceName=PipelineScheduler, totalSlices=27, sliceNumber=16, lockTime=02/17/2016 08:41:00, releaseTime=02/17/2016 08:41:00, previousLockTime=02/17/2016 08:40:00)"
];
Traces
| parse EventText with * "resourceName=" resourceName ", totalSlices=" totalSlices: long * "sliceNumber=" sliceNumber: long * "lockTime=" lockTime ", releaseTime=" releaseTime: date "," * "previousLockTime=" previousLockTime: date ")" *
| project resourceName, totalSlices, sliceNumber, lockTime, releaseTime, previousLockTime
输出
resourceName | totalSlices | sliceNumber | lockTime | releaseTime | previousLockTime |
---|---|---|---|---|---|
PipelineScheduler | 27 | 15 | 02/17/2016 08:40:00 | 2016-02-17 08:40:00.0000000 | 2016-02-17 08:39:00.0000000 |
PipelineScheduler | 27 | 23 | 02/17/2016 08:40:01 | 2016-02-17 08:40:01.0000000 | 2016-02-17 08:39:01.0000000 |
PipelineScheduler | 27 | 20 | 02/17/2016 08:40:01 | 2016-02-17 08:40:01.0000000 | 2016-02-17 08:39:01.0000000 |
PipelineScheduler | 27 | 16 | 02/17/2016 08:41:00 | 2016-02-17 08:41:00.0000000 | 2016-02-17 08:40:00.0000000 |
PipelineScheduler | 27 | 22 | 02/17/2016 08:41:01 | 2016-02-17 08:41:00.0000000 | 2016-02-17 08:40:01.0000000 |
提取电子邮件别名和 DNS
在以下示例中,将分析联系人表中的条目,以便从电子邮件地址中提取别名和域,并从网站 URL 中提取域。 查询会返回 EmailAddress
、EmailAlias
和 WebsiteDomain
列,其中 fullEmail
列会合并分析的电子邮件别名和域。
let Leads=datatable(Contacts: string)
[
"Event: LeadContact (email=john@contosohotel.com, Website=https:contosohotel.com)",
"Event: LeadContact (email=abi@fourthcoffee.com, Website=https:www.fourthcoffee.com)",
"Event: LeadContact (email=nevena@treyresearch.com, Website=https:treyresearch.com)",
"Event: LeadContact (email=faruk@tailspintoys.com, Website=https:tailspintoys.com)",
"Event: LeadContact (email=ebere@relecloud.com, Website=https:relecloud.com)",
];
Leads
| parse Contacts with * "email=" alias:string "@" domain: string ", Website=https:" WebsiteDomain: string ")"
| project EmailAddress=strcat(alias, "@", domain), EmailAlias=alias, WebsiteDomain
输出
EmailAddress | EmailAlias | WebsiteDomain |
---|---|---|
nevena@treyresearch.com | nevena | treyresearch.com |
john@contosohotel.com | john | contosohotel.com |
faruk@tailspintoys.com | faruk | tailspintoys.com |
ebere@relecloud.com | ebere | relecloud.com |
abi@fourthcoffee.com | abi | www.fourthcoffee.com |
正则表达式模式
在下面的示例中,正则表达式用于分析和提取 EventText
列中的数据。 提取的数据会投影到新字段中。
let Traces=datatable(EventText: string)
[
"Event: NotifySliceRelease (resourceName=PipelineScheduler, totalSlices=27, sliceNumber=23, lockTime=02/17/2016 08:40:01, releaseTime=02/17/2016 08:40:01, previousLockTime=02/17/2016 08:39:01)",
"Event: NotifySliceRelease (resourceName=PipelineScheduler, totalSlices=27, sliceNumber=15, lockTime=02/17/2016 08:40:00, releaseTime=02/17/2016 08:40:00, previousLockTime=02/17/2016 08:39:00)",
"Event: NotifySliceRelease (resourceName=PipelineScheduler, totalSlices=27, sliceNumber=20, lockTime=02/17/2016 08:40:01, releaseTime=02/17/2016 08:40:01, previousLockTime=02/17/2016 08:39:01)",
"Event: NotifySliceRelease (resourceName=PipelineScheduler, totalSlices=27, sliceNumber=22, lockTime=02/17/2016 08:41:01, releaseTime=02/17/2016 08:41:00, previousLockTime=02/17/2016 08:40:01)",
"Event: NotifySliceRelease (resourceName=PipelineScheduler, totalSlices=27, sliceNumber=16, lockTime=02/17/2016 08:41:00, releaseTime=02/17/2016 08:41:00, previousLockTime=02/17/2016 08:40:00)"
];
Traces
| parse kind=regex EventText with "(.*?)[a-zA-Z]*=" resourceName @", totalSlices=\s*\d+\s*.*?sliceNumber=" sliceNumber: long ".*?(previous)?lockTime=" lockTime ".*?releaseTime=" releaseTime ".*?previousLockTime=" previousLockTime: date "\\)"
| project resourceName, sliceNumber, lockTime, releaseTime, previousLockTime
输出
resourceName | sliceNumber | lockTime | releaseTime | previousLockTime |
---|---|---|---|---|
PipelineScheduler | 15 | 02/17/2016 08:40:00, | 02/17/2016 08:40:00, | 2016-02-17 08:39:00.0000000 |
PipelineScheduler | 23 | 02/17/2016 08:40:01, | 02/17/2016 08:40:01, | 2016-02-17 08:39:01.0000000 |
PipelineScheduler | 20 | 02/17/2016 08:40:01, | 02/17/2016 08:40:01, | 2016-02-17 08:39:01.0000000 |
PipelineScheduler | 16 | 02/17/2016 08:41:00, | 02/17/2016 08:41:00, | 2016-02-17 08:40:00.0000000 |
PipelineScheduler | 22 | 02/17/2016 08:41:01, | 02/17/2016 08:41:00, | 2016-02-17 08:40:01.0000000 |
使用正则表达式标志的正则表达式模式
在以下示例中,提取了 resourceName
。
let Traces=datatable(EventText: string)
[
"Event: NotifySliceRelease (resourceName=PipelineScheduler, totalSlices=27, sliceNumber=23, lockTime=02/17/2016 08:40:01, releaseTime=02/17/2016 08:40:01, previousLockTime=02/17/2016 08:39:01)",
"Event: NotifySliceRelease (resourceName=PipelineScheduler, totalSlices=27, sliceNumber=15, lockTime=02/17/2016 08:40:00, releaseTime=02/17/2016 08:40:00, previousLockTime=02/17/2016 08:39:00)",
"Event: NotifySliceRelease (resourceName=PipelineScheduler, totalSlices=27, sliceNumber=20, lockTime=02/17/2016 08:40:01, releaseTime=02/17/2016 08:40:01, previousLockTime=02/17/2016 08:39:01)",
"Event: NotifySliceRelease (resourceName=PipelineScheduler, totalSlices=27, sliceNumber=22, lockTime=02/17/2016 08:41:01, releaseTime=02/17/2016 08:41:00, previousLockTime=02/17/2016 08:40:01)",
"Event: NotifySliceRelease (resourceName=PipelineScheduler, totalSlices=27, sliceNumber=16, lockTime=02/17/2016 08:41:00, releaseTime=02/17/2016 08:41:00, previousLockTime=02/17/2016 08:40:00)"
];
Traces
| parse kind=regex EventText with * "resourceName=" resourceName ',' *
| project resourceName
输出
resourceName |
---|
PipelineScheduler, totalSlices=27, sliceNumber=23, lockTime=02/17/2016 08:40:01, releaseTime=02/17/2016 08:40:01 |
PipelineScheduler, totalSlices=27, sliceNumber=15, lockTime=02/17/2016 08:40:00, releaseTime=02/17/2016 08:40:00 |
PipelineScheduler, totalSlices=27, sliceNumber=20, lockTime=02/17/2016 08:40:01, releaseTime=02/17/2016 08:40:01 |
PipelineScheduler, totalSlices=27, sliceNumber=22, lockTime=02/17/2016 08:41:01, releaseTime=02/17/2016 08:41:00 |
PipelineScheduler, totalSlices=27, sliceNumber=16, lockTime=02/17/2016 08:41:00, releaseTime=02/17/2016 08:41:00 |
如果有些记录在 resourceName
中有时显示为小写,有时显示为大写,则某些值可能会获得 null。
上一示例中的结果是意外结果,并且包括完整的事件数据,因为默认模式是贪婪的。
要仅提取 resourceName
,请使用非贪婪 U
运行上一个查询,并禁用区分大小写的 i
正则表达式标志。
let Traces=datatable(EventText: string)
[
"Event: NotifySliceRelease (resourceName=PipelineScheduler, totalSlices=27, sliceNumber=23, lockTime=02/17/2016 08:40:01, releaseTime=02/17/2016 08:40:01, previousLockTime=02/17/2016 08:39:01)",
"Event: NotifySliceRelease (resourceName=PipelineScheduler, totalSlices=27, sliceNumber=15, lockTime=02/17/2016 08:40:00, releaseTime=02/17/2016 08:40:00, previousLockTime=02/17/2016 08:39:00)",
"Event: NotifySliceRelease (resourceName=PipelineScheduler, totalSlices=27, sliceNumber=20, lockTime=02/17/2016 08:40:01, releaseTime=02/17/2016 08:40:01, previousLockTime=02/17/2016 08:39:01)",
"Event: NotifySliceRelease (resourceName=PipelineScheduler, totalSlices=27, sliceNumber=22, lockTime=02/17/2016 08:41:01, releaseTime=02/17/2016 08:41:00, previousLockTime=02/17/2016 08:40:01)",
"Event: NotifySliceRelease (resourceName=PipelineScheduler, totalSlices=27, sliceNumber=16, lockTime=02/17/2016 08:41:00, releaseTime=02/17/2016 08:41:00, previousLockTime=02/17/2016 08:40:00)"
];
Traces
| parse kind=regex flags=Ui EventText with * "RESOURCENAME=" resourceName ',' *
| project resourceName
输出
resourceName |
---|
PipelineScheduler |
PipelineScheduler |
PipelineScheduler |
PipelineScheduler |
PipelineScheduler |
如果分析的字符串具有换行符,请使用标志 s
来分析文本。
let Traces=datatable(EventText: string)
[
"Event: NotifySliceRelease (resourceName=PipelineScheduler\ntotalSlices=27\nsliceNumber=23\nlockTime=02/17/2016 08:40:01\nreleaseTime=02/17/2016 08:40:01\npreviousLockTime=02/17/2016 08:39:01)",
"Event: NotifySliceRelease (resourceName=PipelineScheduler\ntotalSlices=27\nsliceNumber=15\nlockTime=02/17/2016 08:40:00\nreleaseTime=02/17/2016 08:40:00\npreviousLockTime=02/17/2016 08:39:00)",
"Event: NotifySliceRelease (resourceName=PipelineScheduler\ntotalSlices=27\nsliceNumber=20\nlockTime=02/17/2016 08:40:01\nreleaseTime=02/17/2016 08:40:01\npreviousLockTime=02/17/2016 08:39:01)",
"Event: NotifySliceRelease (resourceName=PipelineScheduler\ntotalSlices=27\nsliceNumber=22\nlockTime=02/17/2016 08:41:01\nreleaseTime=02/17/2016 08:41:00\npreviousLockTime=02/17/2016 08:40:01)",
"Event: NotifySliceRelease (resourceName=PipelineScheduler\ntotalSlices=27\nsliceNumber=16\nlockTime=02/17/2016 08:41:00\nreleaseTime=02/17/2016 08:41:00\npreviousLockTime=02/17/2016 08:40:00)"
];
Traces
| parse kind=regex flags=s EventText with * "resourceName=" resourceName: string "(.*?)totalSlices=" totalSlices: long "(.*?)lockTime=" lockTime: datetime "(.*?)releaseTime=" releaseTime: datetime "(.*?)previousLockTime=" previousLockTime: datetime "\\)"
| project-away EventText
输出
resourceName | totalSlices | lockTime | releaseTime | previousLockTime |
---|---|---|---|---|
PipelineScheduler |
27 | 2016-02-17 08:40:00.0000000 | 2016-02-17 08:40:00.0000000 | 2016-02-17 08:39:00.0000000 |
PipelineScheduler |
27 | 2016-02-17 08:40:01.0000000 | 2016-02-17 08:40:01.0000000 | 2016-02-17 08:39:01.0000000 |
PipelineScheduler |
27 | 2016-02-17 08:40:01.0000000 | 2016-02-17 08:40:01.0000000 | 2016-02-17 08:39:01.0000000 |
PipelineScheduler |
27 | 2016-02-17 08:41:00.0000000 | 2016-02-17 08:41:00.0000000 | 2016-02-17 08:40:00.0000000 |
PipelineScheduler |
27 | 2016-02-17 08:41:01.0000000 | 2016-02-17 08:41:00.0000000 | 2016-02-17 08:40:01.0000000 |
Relaxed 模式
在以下相关模式示例中,扩展列 totalSlices
必须为类型 long
。 但是,在分析的字符串中,它的值是 nonValidLongValue
。
对于扩展列 releaseTime
,不能将 nonValidDateTime
值分析为 datetime
。
这两个扩展列会产生 null
值,而其他列(例如 sliceNumber
)仍会产生正确的值。
如果对以下查询使用选项 kind = simple
,则会获得所有扩展列的 null
结果。 此选项在扩展列上是严格的,这是 relaxed 模式和 simple 模式之间的差异。
注意
在 relaxed 模式下,可以对扩展列进行部分匹配。
let Traces=datatable(EventText: string)
[
"Event: NotifySliceRelease (resourceName=PipelineScheduler, totalSlices=27, sliceNumber=23, lockTime=02/17/2016 08:40:01, releaseTime=nonValidDateTime 08:40:01, previousLockTime=02/17/2016 08:39:01)",
"Event: NotifySliceRelease (resourceName=PipelineScheduler, totalSlices=27, sliceNumber=15, lockTime=02/17/2016 08:40:00, releaseTime=nonValidDateTime, previousLockTime=02/17/2016 08:39:00)",
"Event: NotifySliceRelease (resourceName=PipelineScheduler, totalSlices=nonValidLongValue, sliceNumber=20, lockTime=02/17/2016 08:40:01, releaseTime=nonValidDateTime 08:40:01, previousLockTime=02/17/2016 08:39:01)",
"Event: NotifySliceRelease (resourceName=PipelineScheduler, totalSlices=27, sliceNumber=22, lockTime=02/17/2016 08:41:01, releaseTime=02/17/2016 08:41:00, previousLockTime=02/17/2016 08:40:01)",
"Event: NotifySliceRelease (resourceName=PipelineScheduler, totalSlices=nonValidLongValue, sliceNumber=16, lockTime=02/17/2016 08:41:00, releaseTime=02/17/2016 08:41:00, previousLockTime=02/17/2016 08:40:00)"
];
Traces
| parse kind=relaxed EventText with * "resourceName=" resourceName ", totalSlices=" totalSlices: long ", sliceNumber=" sliceNumber: long * "lockTime=" lockTime ", releaseTime=" releaseTime: date "," * "previousLockTime=" previousLockTime: date ")" *
| project-away EventText
输出
resourceName | totalSlices | sliceNumber | lockTime | releaseTime | previousLockTime |
---|---|---|---|---|---|
PipelineScheduler | 27 | 15 | 02/17/2016 08:40:00 | 2016-02-17 08:39:00.0000000 | |
PipelineScheduler | 27 | 23 | 02/17/2016 08:40:01 | 2016-02-17 08:39:01.0000000 | |
PipelineScheduler | 20 | 02/17/2016 08:40:01 | 2016-02-17 08:39:01.0000000 | ||
PipelineScheduler | 16 | 02/17/2016 08:41:00 | 2016-02-17 08:41:00.0000000 | 2016-02-17 08:40:00.0000000 | |
PipelineScheduler | 27 | 22 | 02/17/2016 08:41:01 | 2016-02-17 08:41:00.0000000 | 2016-02-17 08:40:01.0000000 |