Use paging when you need to break a result set into smaller sets of records for processing, or when a result set would exceed the maximum of 1,000 records returned per query. The REST API QueryResponse provides values that indicate a result set was broken up: resultTruncated and $skipToken. resultTruncated is a Boolean value that tells the consumer whether there are more records that weren't returned in the response. You can also identify this condition when the count property is less than the totalRecords property. totalRecords is the number of records that match the query.
resultTruncated is true when there are fewer resources available than a query is requesting, when paging is disabled, or when paging isn't possible because:
- The query contains a limit or sample/take operator.
- All output columns are either dynamic or null type.
When resultTruncated is true, the $skipToken property isn't set.
The following examples show how to skip the first 3,000 records and return the next 1,000 records with Azure CLI and Azure PowerShell:
az graph query -q "Resources | project id, name | order by id asc" --first 1000 --skip 3000
Search-AzGraph -Query "Resources | project id, name | order by id asc" -First 1000 -Skip 3000
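The skip and first values for any page follow from the page number and the page size. The following is a minimal sketch of that arithmetic, assuming 1-based page numbers. The helper name `page_window` is hypothetical and isn't part of Azure CLI, Azure PowerShell, or any Azure SDK.

```python
def page_window(page_number: int, page_size: int = 1000) -> tuple[int, int]:
    """Return (skip, first) for a 1-based page number."""
    if page_number < 1:
        raise ValueError("page_number must be 1 or greater")
    if not 1 <= page_size <= 1000:
        # 1,000 is the per-query maximum stated in this article.
        raise ValueError("page_size must be between 1 and 1000")
    return (page_number - 1) * page_size, page_size


skip, first = page_window(4)
print(f"--first {first} --skip {skip}")  # --first 1000 --skip 3000
```

Page 4 with a page size of 1,000 produces the same values used in the preceding commands.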
Important
The response won't contain $skipToken if:
- The query contains a limit or sample/take operator.
- All output columns are either dynamic or null type.
For an example, go to Next page query in the REST API docs.
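The paging flow can be sketched as a loop that requests a page, reads `$skipToken` from the response, and passes it to the next request until no token is returned. This sketch assumes a hypothetical `run_query` callable that wraps whatever client you use and returns a dictionary with a `data` list and an optional `$skipToken` value. The callable and the fake responses are illustrative only and aren't an Azure SDK API.

```python
from typing import Callable, Optional


def fetch_all_pages(run_query: Callable[[Optional[str]], dict]) -> list:
    """Collect rows from every page by following $skipToken."""
    rows = []
    skip_token = None
    while True:
        response = run_query(skip_token)
        rows.extend(response.get("data", []))
        skip_token = response.get("$skipToken")
        if not skip_token:
            # No token: either this was the last page, or paging isn't possible.
            return rows


# Stand-in for a real client call, used only to exercise the loop.
_fake_pages = {
    None: {"data": [{"id": "a"}, {"id": "b"}], "$skipToken": "t1"},
    "t1": {"data": [{"id": "c"}]},
}


def fake_run_query(skip_token):
    return _fake_pages[skip_token]


all_rows = fetch_all_pages(fake_run_query)
print([row["id"] for row in all_rows])  # ['a', 'b', 'c']
```

Because a missing `$skipToken` can also mean that paging wasn't possible, a caller might additionally compare the number of collected rows with totalRecords.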
Pagination limitations
Azure Resource Graph provides powerful capabilities for querying resources across your Azure environment. When working with large result sets that require pagination, understanding how pagination behaves in different scenarios helps you retrieve consistent and complete results.
This article explains pagination considerations and provides strategies for scenarios where you might observe duplicate or missing records in your paginated results.
Pagination limitations scenario: Sorting by non-unique columns
When paginating results sorted by a non-unique column, you might encounter duplicate or missing records even in static environments where resources aren't changing. This occurs because records with identical sort values have no guaranteed order, and their positions can shift between pagination calls.
Note
When using skip or first, it's recommended to order results by at least one column with asc or desc. Without sorting, results are random and not repeatable.
Why this scenario happens
Consider a query that retrieves virtual machines sorted by location:
Resources
| where type =~ 'microsoft.compute/virtualmachines'
| order by location asc
| project name, location, resourceGroup
If multiple VMs share the same location value (for example, chinaeast2), their relative order isn't deterministic. When paginating:
Page 1 request: Retrieve the first five virtual machines.
| Position | Name | Location |
|---|---|---|
| 1 | vm-web-01 | chinaeast2 |
| 2 | vm-db-01 | chinaeast2 |
| 3 | vm-app-01 | chinaeast2 |
| 4 | vm-cache-01 | chinaeast2 |
| 5 | vm-api-01 | chinaeast2 |
Page 2 request: Skip five records and retrieve the next five.
Because all these VMs share the same location value (chinaeast2), their relative order isn't guaranteed to remain consistent across pagination calls. The second page might return:
| Position | Name | Location |
|---|---|---|
| 1 | vm-app-01 | chinaeast2 |
| 2 | vm-queue-01 | chinaeast2 |
| 3 | vm-monitor-01 | chinanorth2 |
| 4 | vm-backup-01 | chinanorth2 |
| 5 | vm-test-01 | chinanorth2 |
Notice that vm-app-01 appears on both pages (a duplicate). The same reordering among records with equal sort values can also cause a missing record: a VM that wasn't returned on page 1 can move into the first five positions when page 2 is evaluated, so it never appears on any page.
Solution: sort by a unique column
Always include a unique column such as id in your sort order to ensure deterministic results:
Resources
| where type =~ 'microsoft.compute/virtualmachines'
| order by location asc, id asc
| project name, location, resourceGroup
By adding id (which is unique for every resource) as a secondary sort column, you establish a stable, deterministic order that remains consistent across pagination calls.
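The effect of the tie-breaking column can be reproduced with a small simulation. This sketch doesn't call Azure Resource Graph. It assumes that an engine may return tied records in any order, and models that with a seeded shuffle before each sort. The VM list, the `/vms/...` IDs, and the helper names are illustrative.

```python
import random

# Seven VMs share one location and three share another (names are illustrative).
VMS = [{"id": f"/vms/{name}", "name": name, "location": loc}
       for name, loc in [
           ("vm-web-01", "chinaeast2"), ("vm-db-01", "chinaeast2"),
           ("vm-app-01", "chinaeast2"), ("vm-cache-01", "chinaeast2"),
           ("vm-api-01", "chinaeast2"), ("vm-queue-01", "chinaeast2"),
           ("vm-job-01", "chinaeast2"), ("vm-monitor-01", "chinanorth2"),
           ("vm-backup-01", "chinanorth2"), ("vm-test-01", "chinanorth2"),
       ]]


def get_page(sort_key, skip, first, seed):
    """Return one page; the seeded shuffle models an arbitrary tie order per call."""
    records = VMS[:]
    random.Random(seed).shuffle(records)
    ordered = sorted(records, key=sort_key)  # stable sort keeps the shuffled tie order
    return [vm["name"] for vm in ordered][skip:skip + first]


def paginate(sort_key, seeds=(1, 2)):
    """Fetch two pages of five, each call with its own tie order."""
    return get_page(sort_key, 0, 5, seeds[0]) + get_page(sort_key, 5, 5, seeds[1])


def by_location(vm):
    return vm["location"]


def by_location_and_id(vm):
    return (vm["location"], vm["id"])


stable = paginate(by_location_and_id)
print(len(set(stable)) == len(VMS))  # True: every VM is returned exactly once

unstable = paginate(by_location)
print(len(set(unstable)), "distinct VMs out of", len(VMS))
```

With the non-unique key, the number of distinct VMs depends on the tie order of each call, which is the behavior described in this scenario.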
Note
Sorting by a unique column resolves pagination inconsistencies in static environments where resources aren't frequently changing. If you're still experiencing duplicate or missing records after adding a unique sort column, your environment is likely dynamic, with resources being created or deleted during pagination. See the next scenario, pagination in dynamic environments, for strategies to handle dynamic environments.
Pagination limitations scenario: Pagination in dynamic environments
If you're experiencing duplicate or missing records despite sorting by a unique column, the cause is likely changes occurring in your Azure environment during pagination. When paginating through large result sets, changes to your Azure environment between requests can affect which records appear on each page.
Why this scenario happens
When resources change between pagination requests, the underlying data shifts. Consider the following example scenario:
Page 1 request: Retrieve the first 100 virtual machines sorted by id.
Resources
| where type =~ 'microsoft.compute/virtualmachines'
| order by id asc
| take 100
This request returns VMs with IDs from vm-001 through vm-100.
Between requests: Resource vm-050 is deleted from your environment.
Page 2 request: Skip the first 100 records and retrieve the next 100.
Resources
| where type =~ 'microsoft.compute/virtualmachines'
| order by id asc
| skip 100
| take 100
Because vm-050 was deleted, all subsequent resources shifted up by one position. The resource vm-101 is now at position 100, so when you skip 100 records, vm-101 is skipped and doesn't appear on any page.
Similarly, if a new resource is created between requests, subsequent resources shift down by one position, and you might see duplicate records in your results.
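The position shift can be reproduced with plain lists. This sketch doesn't call Azure Resource Graph. It models the result set as a sorted list of IDs and applies skip and take by slicing. The `page` helper and the `vm-000` ID are illustrative.

```python
def page(ids, skip, first):
    """Return one page from the current, sorted snapshot of IDs."""
    return sorted(ids)[skip:skip + first]


ids = [f"vm-{n:03d}" for n in range(1, 251)]  # vm-001 through vm-250

# Deletion between requests: one record is never returned.
page1 = page(ids, 0, 100)
after_delete = [i for i in ids if i != "vm-050"]
page2 = page(after_delete, 100, 100)
print("vm-101" in page1 + page2)  # False

# Creation between requests: one record is returned twice.
after_create = ids + ["vm-000"]  # sorts before every existing ID
page2_dup = page(after_create, 100, 100)
print(page1[-1], page2_dup[0])  # vm-100 vm-100
```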
Client-side strategies for dynamic environments
If your scenario requires a more consistent retrieval of resources, consider one of the following approaches. These strategies partition your data in a way that's resilient to changes and can also improve performance through parallel execution.
Note
These client-side strategies move the pagination logic to your application, which helps avoid the pagination inconsistencies described previously. However, they don't guarantee complete consistency across calls. Resources might be added or deleted between your initial query (for counting or retrieving IDs) and subsequent data fetches. This can result in discrepancies such as a mismatch between expected count and total resources fetched, or missing results if a resource was deleted during the operation. For scenarios requiring strict consistency, consider whether point-in-time accuracy is critical for your use case.
Option 1: Hash-based data partitioning
This approach partitions your data using a hash function to ensure consistent and non-overlapping results across multiple queries. Each resource belongs to exactly one partition based on its unique identifier.
Step 1: Get the total record count
First, determine how many records match your query:
Resources
| where type =~ 'microsoft.compute/virtualmachines'
| count
Use the count to determine the number of partitions needed. For example, if the count is 7,712 records, you need at least eight partitions, because Azure Resource Graph returns a maximum of 1,000 records per query.
Step 2: Query each partition
Use the hash() function to partition data based on the resource ID. Query each partition separately:
Partition 0
Resources
| where type =~ 'microsoft.compute/virtualmachines'
| where hash(tolower(id)) % 8 == 0
Partition 1
Resources
| where type =~ 'microsoft.compute/virtualmachines'
| where hash(tolower(id)) % 8 == 1
Continue for each partition through partition 7:
Partition 7
Resources
| where type =~ 'microsoft.compute/virtualmachines'
| where hash(tolower(id)) % 8 == 7
Pseudocode
// Step 1: Get total count and calculate partitions
totalCount = executeQuery("Resources | where type =~ 'microsoft.compute/virtualmachines' | count")
numPartitions = ceiling(totalCount / 1000)
// Step 2: Build queries for each partition
queries = []
for i = 0 to numPartitions - 1:
    queries.append("Resources
        | where type =~ 'microsoft.compute/virtualmachines'
        | where hash(tolower(id)) % {numPartitions} == {i}")
// Step 3: Execute all queries in parallel and combine results
allResults = executeInParallel(queries)
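The pseudocode can be sketched in Python as follows. This is one possible shape, not a reference implementation. It assumes a hypothetical `execute_query` callable that you supply, which takes a query string and returns a list of rows. The function names, the `max_workers` default, and the use of a thread pool are choices made for this sketch.

```python
import math
from concurrent.futures import ThreadPoolExecutor

MAX_RECORDS_PER_QUERY = 1000  # per-query maximum stated in this article
BASE_QUERY = "Resources | where type =~ 'microsoft.compute/virtualmachines'"


def build_partition_queries(total_count, base_query=BASE_QUERY):
    """Build one KQL query per hash partition."""
    num_partitions = max(1, math.ceil(total_count / MAX_RECORDS_PER_QUERY))
    return [
        f"{base_query} | where hash(tolower(id)) % {num_partitions} == {i}"
        for i in range(num_partitions)
    ]


def fetch_partitions(queries, execute_query, max_workers=8):
    """Run the partition queries in parallel and flatten the rows.

    execute_query is a hypothetical callable: query string in, list of rows out.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(execute_query, queries))
    return [row for rows in results for row in rows]


queries = build_partition_queries(7712)
print(len(queries))   # 8
print(queries[-1])    # ends with: % 8 == 7
```

To combine the results, call `fetch_partitions(queries, execute_query)` with your own query function.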
Benefits
- No duplicates or missed records: Each resource ID hashes to exactly one partition.
- Parallel execution: All partition queries can run simultaneously, reducing total query time.
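The exactly-one-partition property holds for any deterministic hash. The following sketch illustrates it locally. It uses `zlib.crc32` only as a stand-in for the KQL hash() function, so the partition assignments won't match those computed by Azure Resource Graph. The resource IDs are placeholders.

```python
import zlib
from collections import Counter


def partition_of(resource_id: str, num_partitions: int) -> int:
    """Assign an ID to a partition; crc32 stands in for the KQL hash() function."""
    return zlib.crc32(resource_id.lower().encode("utf-8")) % num_partitions


ids = [f"/subscriptions/example/vm-{n:04d}" for n in range(7712)]
sizes = Counter(partition_of(i, 8) for i in ids)

print(sum(sizes.values()) == len(ids))  # True: each ID lands in exactly one partition
print(max(sizes.values()))              # size of the largest partition
```

Hash partitions are only approximately equal in size. When the record count is close to a multiple of 1,000, the largest partition can exceed 1,000 records, so it can be worth using more partitions than the minimum.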
Option 2: Batch processing with resource IDs
This approach retrieves all resource IDs first, then queries for complete records in smaller batches. This ensures you have a consistent set of identifiers before retrieving the full resource data.
Step 1: Retrieve all resource IDs
Use summarize with make_set() to retrieve all resource IDs:
Resources
| where type =~ 'microsoft.compute/virtualmachines'
| summarize make_set(id)
Step 2: Query in batches
Once you have the list of resource IDs, query for full records in batches of 1,000 or fewer:
Batch 1
Resources
| where type =~ 'microsoft.compute/virtualmachines'
| where id in~ ('id1', 'id2', ... , 'id1000')
Batch 2
Resources
| where type =~ 'microsoft.compute/virtualmachines'
| where id in~ ('id1001', 'id1002', ... , 'id2000')
Continue until all IDs are covered.
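The batching step can be sketched as two small helpers: one that splits the ID list into batches, and one that builds the query for a batch. The helper names and the placeholder IDs are illustrative. The sketch assumes that resource IDs contain no single quotes, so it performs no escaping.

```python
def chunk(ids, size=1000):
    """Split a list of IDs into consecutive batches of at most `size`."""
    return [ids[i:i + size] for i in range(0, len(ids), size)]


def build_batch_query(batch):
    """Build the KQL for one batch; assumes IDs contain no single quotes."""
    id_list = ", ".join(f"'{resource_id}'" for resource_id in batch)
    return (
        "Resources"
        " | where type =~ 'microsoft.compute/virtualmachines'"
        f" | where id in~ ({id_list})"
    )


all_ids = [f"id{n}" for n in range(1, 2501)]  # placeholder IDs
batches = chunk(all_ids)
print([len(b) for b in batches])  # [1000, 1000, 500]
print(build_batch_query(["id1", "id2"]))
```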
Benefits
- Guaranteed completeness: You have a fixed set of IDs before querying for details.
- Parallel execution: Batch queries can run simultaneously.
If the query response is larger than 16 MB and doesn't fit in a single call, use the hash-based partitioning technique described previously to fetch the data in multiple calls.
The following is an example of a query that might exceed the response size limit of 16 MB:
Resources
| where type =~ 'microsoft.compute/virtualmachines'
| summarize make_set(id)
In this case, use partitioning:
Partition 0
Resources
| where type =~ 'microsoft.compute/virtualmachines'
| where hash(tolower(id))%10 == 0
| summarize make_set(id)
Partition 1
Resources
| where type =~ 'microsoft.compute/virtualmachines'
| where hash(tolower(id))%10 == 1
| summarize make_set(id)
Continue for each partition through partition 9:
Partition 9
Resources
| where type =~ 'microsoft.compute/virtualmachines'
| where hash(tolower(id))%10 == 9
| summarize make_set(id)
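The partitioned ID queries and the merge of their results can be sketched as follows. The helper names are hypothetical. `merge_id_sets` assumes that each partition result has already been unpacked into a plain list of ID strings, and it compares IDs case-insensitively to match the `tolower(id)` used in the queries.

```python
def build_id_queries(num_partitions):
    """Build one `summarize make_set(id)` query per hash partition."""
    return [
        "Resources"
        " | where type =~ 'microsoft.compute/virtualmachines'"
        f" | where hash(tolower(id)) % {num_partitions} == {i}"
        " | summarize make_set(id)"
        for i in range(num_partitions)
    ]


def merge_id_sets(partition_results):
    """Union the per-partition ID lists, comparing case-insensitively."""
    merged = {}
    for id_list in partition_results:
        for resource_id in id_list:
            merged.setdefault(resource_id.lower(), resource_id)
    return sorted(merged.values(), key=str.lower)


print(len(build_id_queries(10)))                       # 10
print(merge_id_sets([["/vm/B", "/vm/a"], ["/vm/c"]]))  # ['/vm/a', '/vm/B', '/vm/c']
```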
Summary
In this article, you learned:
- How sorting by non-unique columns can cause duplicate or missing records during pagination
- How dynamic environments with changing resources can affect paginated results
- Client-side strategies including hash-based partitioning and batch processing with resource IDs