Graph Query Language (GQL) reference (preview)

Applies to: ✅ Azure Data Explorer

Note

GQL support is in preview. Features and syntax can change based on feedback and ongoing development.

Introduction

This reference covers the fundamental concepts, functions, and operators of Graph Query Language (GQL). Graph Query Language (GQL) is built on mathematical graph theory concepts that provide a solid foundation for querying graph data. Understanding these fundamentals helps you write more effective queries and better understand how GQL processes your data. GQL also provides a rich set of functions and operators to work with graph patterns, nodes, edges, and properties.

Fundamental concepts

This section covers the core concepts that form the foundation of graph data analysis with GQL.

Graph patterns

Graph patterns are the core building blocks of GQL queries. They describe the structure you want to find in your graph data using a declarative syntax that mirrors the visual representation of graphs.

Node patterns

Node patterns specify how to match individual nodes in your graph:

(n)              -- Any node
(n:Person)       -- Node with Person label  
(n:Person&City)  -- Node with Person AND City label
(:Person)        -- Person node, don't bind variable

Key concepts:

  • Variable binding: (n) creates a variable n that you can reference later in the query
  • Anonymous nodes: (:Person) matches nodes without creating a variable
  • Label filtering: :Person restricts matches to nodes with the Person label
  • Label combinations: Use & for AND, | for OR operations

Edge patterns

Edge patterns define how nodes connect to each other:

-[e]->                -- Directed outgoing edge, any label
-[e:works_at]->       -- Directed edge, works_at label
-[e:knows|likes]->    -- knows OR likes edge
<-[e]-                -- Directed incoming edge
-[e]-                 -- Undirected (any direction)

Key concepts:

  • Direction: -> for outgoing, <- for incoming, - for any direction
  • Edge types: Use labels like :works_at to filter by relationship type
  • Multiple types: knows|likes matches either relationship type

Label expressions

Labels provide semantic meaning to nodes and edges. GQL supports complex label expressions:

:Person&Company                  -- Both Person AND Company labels
:Person|Company                  -- Person OR Company labels
:!Company                        -- NOT Company label
:(Person|!Company)&City          -- Complex expressions with parentheses

Operators:

  • & (AND): Node must have all specified labels
  • | (OR): Node must have at least one specified label
  • ! (NOT): Node must not have the specified label
  • (): Parentheses for grouping complex expressions

Path patterns

Path patterns describe multi-hop relationships in your graph:

(a)-[e1]->(b)-[e2]->(c)              -- 2-hop path
(a)-[e]->{2,4}(b)                    -- 2 to 4 hops
(a)-[e]->{1,}(b)                     -- 1 or more hops
(a)-[:knows|likes]->{1,3}(b)         -- 1-3 hops via knows/likes
p=()-[:works_at]->()                 -- Binding a path variable

Variable-length paths:

  • {2,4}: Exactly 2 to 4 hops
  • {1,}: 1 or more hops (unbounded)
  • {,3}: Up to 3 hops
  • {5}: Exactly 5 hops

Path variables:

  • p=()->(): Captures the entire path for later analysis
  • Access with NODES(p), RELATIONSHIPS(p), PATH_LENGTH(p)

Multiple patterns

GQL supports complex, non-linear graph structures:

(a)->(b), (a)->(c)               -- Multiple edges from same node
(a)->(b)<-(c), (b)->(d)          -- Non-linear structures

Pattern composition:

  • Use commas , to separate multiple patterns
  • All patterns must match simultaneously
  • Variables can be shared across patterns

Match modes

GQL supports different path matching modes that control how patterns are matched against graph data. These modes affect performance, result completeness, and the types of paths that are returned.

Match modes control how graph elements can be reused across pattern variables within a single MATCH clause.

DIFFERENT EDGES (default)

The default mode in Kusto. A matched edge cannot bind to more than one edge variable, but nodes can be reused freely.

MATCH (a)-[r1]->(b)-[r2]->(c)
-- r1 and r2 must be different edges
-- a, b, c can be the same or different nodes

REPEATABLE ELEMENTS

Allows both edges and nodes to be reused across pattern variables without restrictions.

MATCH REPEATABLE ELEMENTS (a)-[r1]->(b)-[r2]->(c)
-- r1 and r2 can be the same edge
-- a, b, c can be the same or different nodes

Path modes

Path modes control which types of paths are included in results based on repetition constraints.

WALK (default)

The default GQL path mode. Includes all possible paths with no restrictions on node or edge repetition.

MATCH WALK (a)-[]->{1,3}(b)
-- Allows paths with repeating nodes and edges

TRAIL

Filters out paths that have repeating edges. Nodes may repeat, but each edge can only appear once per path.

MATCH TRAIL (a)-[]->{1,3}(b)
-- No edge can appear twice in the same path
-- Nodes may repeat

ACYCLIC

Filters out paths that have repeating nodes. Each node can only appear once per path.

MATCH ACYCLIC (a)-[]->{1,3}(b)
-- No node can appear twice in the same path
-- Prevents cycles entirely

SIMPLE

Same as ACYCLIC but allows the first and last nodes in the path to be the same (forming a simple cycle).

MATCH SIMPLE (a)-[]->{1,3}(b)
-- No node repetition except first/last can match
-- Currently not supported in implementation

Mode combinations and Kusto cycles parameter

Different combinations of match modes and path modes map to specific Kusto cycles parameter values:

DIFFERENT EDGES mode

Path Mode Single Path Multi-path ("star" pattern)
WALK cycles=unique_edges cycles=unique_edges (only if all paths are WALK/TRAIL)
TRAIL cycles=unique_edges cycles=unique_edges (only if all paths are WALK/TRAIL)
ACYCLIC cycles=none Not supported
SIMPLE Not supported Not supported

REPEATABLE ELEMENTS mode

Path Mode Single Path Multi-path ("star" pattern)
WALK cycles=all cycles=all (only if all paths are WALK)
TRAIL cycles=unique_edges Not supported
ACYCLIC cycles=none Not supported
SIMPLE Not supported Not supported

Syntax examples

Basic path modes
-- Default (WALK with DIFFERENT EDGES)
MATCH (n)-[]->(m)
RETURN n, m

-- Explicit WALK mode
MATCH WALK (n)-[]->(m)
RETURN n, m

-- TRAIL mode - no repeating edges
MATCH TRAIL (n)-[]->{1,3}(m)
RETURN n, m

-- ACYCLIC mode - no repeating nodes
MATCH ACYCLIC (n)-[]->{1,3}(m)
RETURN n, m
Match mode combinations
-- DIFFERENT EDGES with WALK (default)
MATCH DIFFERENT EDGES WALK (n)-[]->(m)
RETURN n, m

-- REPEATABLE ELEMENTS with WALK
MATCH REPEATABLE ELEMENTS WALK (n)-[]->(m)
RETURN n, m

-- REPEATABLE ELEMENTS with TRAIL
MATCH REPEATABLE ELEMENTS TRAIL (n)-[]->(m)
RETURN n, m
Multi-pattern queries
-- Multiple WALK patterns (star pattern)
MATCH WALK (n)-[]->(a), WALK (n)-[]->(b)
RETURN n, a, b

-- Mixed path modes (both must be WALK/TRAIL for multi-path)
MATCH WALK (n)-[]->(a), TRAIL (n)-[]->(b)
RETURN n, a, b

Functions and operators reference

Graph Query Language (GQL) provides a rich set of functions and operators to work with graph patterns, nodes, edges, and properties.

Core GQL functions and operators

This table lists the core GQL functions and operators, along with their Kusto Query Language (KQL) equivalents and examples.

GQL Function/Operator Description GQL Example
MATCH Find graph patterns MATCH (a)-[r]->(b)
OPTIONAL MATCH Find patterns that might not exist OPTIONAL MATCH (p)->(c:City)
WHERE Filter patterns and properties WHERE person.age > 25
FILTER Equivalent to WHERE but used without MATCH clauses FILTER p.name = 'Carol' OR c.name = 'Seattle'
IS NULL Check for null values WHERE person.age IS NULL
IS NOT NULL Check for non-null values WHERE person.age IS NOT NULL
RETURN Project results RETURN person.name, person.age
DISTINCT Return unique values RETURN DISTINCT person.name
COUNT(*) Count all rows RETURN COUNT(*)
COUNT() Count non-null values RETURN COUNT(person.name)
SUM() Sum numeric values RETURN SUM(person.age)
MIN() Minimum value RETURN MIN(person.age)
MAX() Maximum value RETURN MAX(person.age)
AVG() Average value RETURN AVG(person.age)
COLLECT_LIST() Collect values into array RETURN COLLECT_LIST(person.name)
SIZE() Array length RETURN SIZE(COLLECT_LIST(n.firstName))
Labels() Show labels for a node or edge RETURN labels(entity)
UPPER() Convert to uppercase RETURN UPPER(person.name)
LOWER() Convert to lowercase RETURN LOWER(person.name)
LEFT() Extract left substring WHERE LEFT(person.name, 3) = 'Tom'
RIGHT() Extract right substring WHERE RIGHT(person.name, 5) = 'Hanks'
STARTS WITH String starts with pattern WHERE person.name STARTS WITH 'Tom'
ENDS WITH String ends with pattern WHERE person.name ENDS WITH 'Hanks'
CONTAINS String contains pattern WHERE person.name CONTAINS 'Tom'
\|\| String concatenation RETURN n.firstName \|\| ' ' \|\| n.lastName
TRIM() Remove whitespace from both ends RETURN TRIM(' abc ')
BTRIM() Remove whitespace from both ends (alias for TRIM) RETURN BTRIM(' abc ')
LTRIM() Remove whitespace from left end RETURN LTRIM(' abc ')
RTRIM() Remove whitespace from right end RETURN RTRIM(' abc ')
TRIM(BOTH FROM) Remove whitespace from both ends (explicit) RETURN TRIM(BOTH FROM ' abc ')
TRIM(LEADING FROM) Remove whitespace from left end (explicit) RETURN TRIM(LEADING FROM ' abc')
TRIM(TRAILING FROM) Remove whitespace from right end (explicit) RETURN TRIM(TRAILING FROM 'abc ')
TRIM(chars FROM) Remove specified characters from both ends RETURN TRIM('_' FROM '_abc_')
TRIM(BOTH chars FROM) Remove specified characters from both ends (explicit) RETURN TRIM(BOTH '_' FROM '_abc_')
TRIM(LEADING chars FROM) Remove specified characters from left end RETURN TRIM(LEADING 'd' FROM 'dddabc')
TRIM(TRAILING chars FROM) Remove specified characters from right end RETURN TRIM(TRAILING 't' FROM 'abcttt')
STRING_JOIN() Join array elements with delimiter RETURN STRING_JOIN(["a", "b" \|\| "c"], "-")
CAST() Convert data types CAST(person.age AS STRING)
ZONED_DATETIME() Create datetime from string ZONED_DATETIME('2024-01-01')
CURRENT_TIMESTAMP Current timestamp WHERE created < CURRENT_TIMESTAMP
DURATION() Create timespan from time units (days to nanoseconds) DURATION({days: 3, hours: 2})
DURATION_BETWEEN() Calculate duration between two datetime values DURATION_BETWEEN(start_time, end_time)
NODES() Extract nodes from a path RETURN NODES(path_variable)
RELATIONSHIPS() Extract edges from a path RETURN RELATIONSHIPS(path_variable)
PATH_LENGTH() Get the length of a path RETURN PATH_LENGTH(path_variable)
ORDER BY Sort results ORDER BY person.age DESC
LIMIT Limit result count LIMIT 10
& (AND) Label intersection MATCH (p:Person & Male)
\| (OR) Label union MATCH (n:Person \| Movie)
! (NOT) Label negation MATCH (p:!Female)

Best practices

  • KQL supports dynamic types, but GQL does not clearly define how these should be handled. To avoid runtime errors, explicitly cast nested fields to their expected type (see CAST).

  • Because GQL runs on KQL, some operations inherit KQL semantics. For instance, COLLECT_LIST (the GQL equivalent of make_list) will flatten arrays if the field is already an array. If results differ from expectations, consult the KQL documentation for the equivalent function.

Performance optimization

Use these strategies to optimize GQL query performance in production environments:

Tip

Start with simple patterns, then increase complexity if needed. Monitor query performance, and adjust path lengths and filters to improve results.

Limit path matching scope:

  • Use specific label filters to reduce the search space: MATCH (start:SpecificType) instead of MATCH (start)
  • Limit variable length paths with reasonable bounds: MATCH (a)-[]->{1,3}(b) instead of unbounded paths
  • Apply WHERE clauses early to filter results before expensive operations.

Use COUNT(*) for existence checks:

If you only need to check if a pattern exists, use COUNT(*) instead of returning full results.

MATCH (user:User)-[:SUSPICIOUS_ACTIVITY]->(target)
WHERE user.id = 'user123'
RETURN COUNT(*) > 0 AS HasSuspiciousActivity

Limitations

  • Reserved keywords: Some GQL keywords can't be used as identifiers in queries. Some reserved keywords aren't immediately obvious (for example, DATE is a reserved keyword). If your graph data has property names that conflict with GQL reserved keywords, use different property names in your graph schema or rename them to avoid parsing conflicts.

    Important

    When you design your graph schema, some common property names might conflict with GQL reserved keywords. Avoid or rename these property names.

  • No INSERT/CREATE support: Operations to change graph structures are not supported. Instead, use KQL for all graph creation, change, and management tasks.

  • Optional matches: Supported only for node patterns (not edges).

  • Entity equivalence checks not supported: GQL's(MATCH (n)-[]-(n2) WHERE n1 <> n2) is not supported. Use explicit field comparisons instead, for example, n.id <> n2.id.

  • Time and timezone: The engine operates in UTC. Datetime literals must use zoned datetime; only the UTC zone is supported via ZONED_DATETIME("2011-12-31 23:59:59.9").

  • Duration granularity: Durations support up to days and smaller units down to nanoseconds. Larger-than-day units (for example, weeks, months, years) aren't supported.

  • Traversal modes: GQL defines configurable traversal modes for matching and paths. For MATCH, the modes are 'DIFFERENT EDGES' and 'REPEATABLE EDGES'; for PATH, the modes are 'WALK', 'TRAIL', 'ACYCLIC', and 'SIMPLE'. The current implementation defaults to 'DIFFERENT EDGES' and 'WALK', respectively. Some combinations are not supported.

Labels() custom GQL function

The labels() function shows the labels for a node or edge as an array.

Syntax:

labels(entity`)

Parameters:

  • entity: A node or edge variable from a matched pattern.

Returns:

Returns an array of strings with all labels for the specified entity.

Examples:

Show labels for matched nodes:

MATCH (entity)
RETURN entity.name, labels(entity)

Output

This query shows the name and all labels for each node in the graph.

entity.name labels(entity)
john.doe ["User"]
admin.user ["User"]
web-server ["System"]
database ["System"]
domain-controller ["System"]

Show labels in projections with aliases:

MATCH (n)-[e]->(target)
RETURN n.name, labels(n) AS n_labels, labels(e) AS edge_labels, target.name

This query shows node names, their labels, and the labels of connecting edges.

n.name n_labels edge_labels target.name
john.doe ["User"] ["CAN_ACCESS"] web-server
admin.user ["User"] ["CAN_ACCESS"] domain-controller
web-server ["System"] ["CAN_ACCESS"] database
domain-controller ["System"] ["CAN_ACCESS"] database