Optimize queries that use named expressions

This article discusses how to optimize repeat use of named expressions in a query.

In Kusto Query Language, you can bind names to complex expressions in several different ways:

When you reference these named expressions in a query, the following steps occur:

  1. The calculation within the named expression is evaluated. This calculation produces either a scalar or tabular value.
  2. The named expression is replaced with the calculated value.

If the same bound name is used multiple times, then the underlying calculation will be repeated multiple times. When is this a concern?

  • When the calculations consume many resources and are used many times.
  • When the calculation is non-deterministic, but the query assumes all invocations to return the same value.

Mitigation

To mitigate these concerns, you can materialize the calculation results in memory during the query. Depending on the way the named calculation is defined, you'll use different materialization strategies:

Tabular functions

Use the following strategies for tabular functions:

  • let statements and function parameters: Use the materialize() function.
  • as operator: Set the hint.materialized hint value to true.

For example, the following query uses the non-deterministic tabular sample operator:

Note

Tables aren't sorted in general, so any table reference in a query is, by definition, non-deterministic.

Behavior without using the materialize function

range x from 1 to 100 step 1
| sample 1
| as T
| union T

Output

x
63
92

Behavior using the materialize function

range x from 1 to 100 step 1
| sample 1
| as hint.materialized=true T
| union T

Output

x
95
95

Scalar functions

Non-deterministic scalar functions can be forced to calculate exactly once by using toscalar().

For example, the following query uses the non-deterministic function, rand():

let x = () {rand(1000)};
let y = () {toscalar(rand(1000))};
print x, x, y, y

Output

print_0 print_1 print_2 print_3
166 137 70 70