Transformations in a data collection rule (DCR) allow you to filter or modify incoming data before it's stored in a Log Analytics workspace. This article describes how to build transformations in a DCR, including details and limitations of the Kusto Query Language (KQL) used for the transform statement.
Data transformations are defined using a Kusto Query Language (KQL) statement that's applied individually to each entry in the data source. The statement must understand the format of the incoming data and create output in the structure of the target table.
The input stream is represented by a virtual table named `source` with columns matching the input data stream definition. The following is a typical example of a transformation. This example includes the following functionality:
- Filters the incoming data with a where statement
- Adds a new column using the extend operator
- Formats the output to match the columns of the target table using the project operator
```kusto
source
| where severity == "Critical"
| extend Properties = parse_json(properties)
| project
    TimeGenerated = todatetime(["time"]),
    Category = category,
    StatusDescription = StatusDescription,
    EventName = name,
    EventId = tostring(Properties.EventId)
```
Since the transformation is applied to each record individually, it can't use any KQL operators that act on multiple records. Only operators that take a single row as input and return no more than one row are supported. For example, summarize isn't supported since it summarizes multiple records. See Supported KQL features for a complete list of supported features.
The datatable operator isn't supported in the subset of KQL available to use in transformations. This would normally be used in KQL to define an inline query-time table. Use dynamic literals instead to work around this limitation.
For example, the following isn't supported in a transformation:
```kusto
let galaxy = datatable (country:string,entity:string)['ES','Spain','US','United States'];
source
| join kind=inner (galaxy) on $left.Location == $right.country
| extend Galaxy_CF = ['entity']
```
You can instead use the following statement, which is supported and performs the same functionality:
```kusto
let galaxyDictionary = parse_json('{"ES": "Spain","US": "United States"}');
source
| extend Galaxy_CF = galaxyDictionary[Location]
```
Transformations don't currently support `has`. Use `contains`, which is supported and provides similar functionality.
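For example, a filter that might otherwise use `has` can be written with `contains` instead. This is a minimal sketch; the `Message` column is an assumption for illustration:

```kusto
// 'has' isn't available in transformations, so match the substring with 'contains'.
// The Message column is hypothetical.
source
| where Message contains "error"
```

Keep in mind that `contains` matches any substring, while `has` matches whole terms, so the two can return slightly different results.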
Consider the following input with dynamic data:
```json
{
    "TimeGenerated": "2021-11-07T09:13:06.570354Z",
    "Message": "Houston, we have a problem",
    "AdditionalContext": {
        "Level": 2,
        "DeviceID": "apollo13"
    }
}
```
To access the properties in `AdditionalContext`, define it as a dynamic-typed column in the input stream:
"columns": [
{
"name": "TimeGenerated",
"type": "datetime"
},
{
"name": "Message",
"type": "string"
},
{
"name": "AdditionalContext",
"type": "dynamic"
}
]
The content of the `AdditionalContext` column can now be parsed and used in the KQL transformation:
```kusto
source
| extend parsedAdditionalContext = parse_json(AdditionalContext)
| extend Level = toint(parsedAdditionalContext.Level)
| extend DeviceId = tostring(parsedAdditionalContext.DeviceID)
```
Use the parse_json function to handle dynamic literals.
For example, the following queries provide the same functionality:
```kusto
print d=dynamic({"a":123, "b":"hello", "c":[1,2,3], "d":{}})
print d=parse_json('{"a":123, "b":"hello", "c":[1,2,3], "d":{}}')
```
The right-hand side of `let` can be a scalar expression, a tabular expression, or a user-defined function. Only user-defined functions with scalar arguments are supported.
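For example, a user-defined function with scalar arguments can be declared with `let` and called from the transformation. This is a minimal sketch; the `Status` column and the status codes are assumptions for illustration:

```kusto
// Hypothetical helper that maps a numeric status code to a text label.
let statusLabel = (code: int) {
    case(code == 0, "Success",
         code == 1, "Warning",
         "Error")
};
source
| extend StatusDescription = statusLabel(toint(Status))
```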
The only supported data sources for the KQL statement are as follows:
- `source`, which represents the source data. For example:

  ```kusto
  source
  | where ActivityId == "383112e4-a7a8-4b94-a701-4266dfc18e41"
  | project PreciseTimeStamp, Message
  ```

- The `print` operator, which always produces a single row. For example:

  ```kusto
  print x = 2 + 2, y = 5 | extend z = exp2(x) + exp2(y)
  ```
The following tabular operators are supported.
- extend
- project
- where
- parse
- project-away
- project-rename
- columnifexists (use columnifexists instead of column_ifexists)
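For example, `columnifexists` can provide a default value when a column might be missing from the incoming stream. This is a minimal sketch; the `Level` column name and the default value are assumptions for illustration:

```kusto
// Use the Level column if the input stream includes it; otherwise fall back to a default value.
source
| extend EventLevel = columnifexists("Level", "Unknown")
```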
All Numerical operators are supported.
All Datetime and Timespan arithmetic operators are supported.
The following String operators are supported.
- ==
- !=
- =~
- !~
- contains
- !contains
- contains_cs
- !contains_cs
- startswith
- !startswith
- startswith_cs
- !startswith_cs
- endswith
- !endswith
- endswith_cs
- !endswith_cs
- matches regex
- in
- !in
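For example, these string operators can be combined in a filter. This is a minimal sketch; the `DeviceId` and `Category` columns and their values are assumptions for illustration:

```kusto
// Keep records whose DeviceId matches a pattern and whose Category is in an allowed set.
source
| where DeviceId matches regex @"^apollo\d+$"
| where Category in ("Audit", "Operational")
```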
The following Bitwise operators are supported.
- binary_and()
- binary_or()
- binary_xor()
- binary_not()
- binary_shift_left()
- binary_shift_right()
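For example, the bitwise functions can test or extract flag bits from a numeric column. This is a minimal sketch; the `Flags` column and the bit positions are assumptions for illustration:

```kusto
// Extract the low byte of Flags and check whether bit 2 is set.
source
| extend LowByte = binary_and(Flags, 0xFF)
| extend IsBit2Set = binary_and(binary_shift_right(Flags, 2), 1) == 1
```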
The following Datetime and Timespan functions are supported.
- ago
- datetime_add
- datetime_diff
- datetime_part
- dayofmonth
- dayofweek
- dayofyear
- endofday
- endofmonth
- endofweek
- endofyear
- getmonth
- getyear
- hourofday
- make_datetime
- make_timespan
- now
- startofday
- startofmonth
- startofweek
- startofyear
- todatetime
- totimespan
- weekofyear
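For example, the datetime and timespan functions can derive time-based columns. This is a minimal sketch; the `StartTime` and `EndTime` columns are assumptions for illustration:

```kusto
// Compute the duration in seconds and bucket the record by the day it started.
source
| extend DurationSeconds = datetime_diff("second", todatetime(EndTime), todatetime(StartTime))
| extend StartDay = startofday(todatetime(StartTime))
```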
The following String functions are supported.
- base64_encodestring (use base64_encodestring instead of base64_encode_tostring)
- base64_decodestring (use base64_decodestring instead of base64_decode_tostring)
- countof
- extract
- extract_all
- indexof
- isempty
- isnotempty
- parse_json
- split
- strcat
- strcat_delim
- strlen
- substring
- tolower
- toupper
The following Hash functions are supported.
- hash_sha256
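For example, the string and hash functions can parse and reshape text columns. This is a minimal sketch; the `Message`, `Category`, and `UserName` columns and the message format are assumptions for illustration:

```kusto
// Pull an error code out of the message, build a composite key, and hash a value before storing it.
source
| extend ErrorCode = extract(@"code=(\d+)", 1, Message)
| extend RecordKey = strcat(tolower(Category), "_", ErrorCode)
| extend UserHash = hash_sha256(UserName)
```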
Use Identifier quoting as required.
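For example, bracket quoting lets you reference a column whose name isn't a valid bare identifier. This is a minimal sketch; the `host-name` column is an assumption for illustration:

```kusto
// The hyphen in the incoming column name requires identifier quoting.
source
| extend HostName = ['host-name']
```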
- Create a data collection rule and an association to it from a virtual machine using the Azure Monitor agent.