.list blobs command (Preview)

Applies to: ✅ Azure Data Explorer

The .list blobs command lists blobs under a specified container path.

This command is typically used with .ingest-from-storage-queued to ingest data. You can also use it on its own to better understand folder contents and parameterize ingestion commands.
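When pairing the two commands, the output of .list blobs feeds the queued ingestion directly. The following is a minimal sketch, assuming a target table named MyTable and the <| composition form; see the .ingest-from-storage-queued documentation for the authoritative syntax.

// MyTable, the format, and the container path are placeholders
.ingest-from-storage-queued into table MyTable with (format="csv")
<|
.list blobs (
    "https://mystorageaccount.blob.core.chinacloudapi.cn/datasets/myfolder;managed_identity=system"
)
MaxFiles=100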

Note

Queued ingestion commands are run on the data ingestion URI endpoint https://ingest-<YourClusterName>.<Region>.kusto.chinacloudapi.cn.

Permissions

You must have at least Table Ingestor permissions to run this command.

Syntax

.list blobs (SourceDataLocators) [Suffix=SuffixValue] [MaxFiles=MaxFilesValue] [PathFormat=PathFormatValue]

Learn more about syntax conventions.

Parameters

| Name | Type | Required | Description |
|---|---|---|---|
| SourceDataLocators | string | ✔️ | One or more storage connection strings, separated by commas. Each connection string can refer to a storage container or to a file prefix within a container. Currently, only one storage connection string is supported. |
| SuffixValue | string | | The suffix by which to filter blobs; only blobs whose paths end with this value are listed. |
| MaxFilesValue | integer | | The maximum number of blobs to return. |
| PathFormatValue | string | | The pattern in the blob's path used to extract the creation time as an output field. For more information, see Path format. |

Note

  • We recommend using obfuscated string literals for SourceDataLocators; see the sketch after this note.

  • When used alone, .list blobs returns up to 1,000 files, regardless of any larger value specified in MaxFiles.
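
For example, prefixing the string literal with h marks it as obfuscated, so the connection string isn't emitted to logs or traces. A minimal sketch; the account and container names are the same placeholders used in the examples below:

// the h prefix obfuscates the literal; account and container are placeholders
.list blobs (
    h"https://mystorageaccount.blob.core.chinacloudapi.cn/datasets/myfolder;managed_identity=system"
)
MaxFiles=10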

Ingestion properties

Important

In queued ingestion, data is batched using ingestion properties. The more distinct ingestion mapping property values used, such as different ConstValue values, the more fragmented the ingestion becomes, which can lead to performance degradation.

The following table lists and describes the supported properties, and provides examples:

| Property | Description | Example |
|---|---|---|
| ingestionMapping | A string value that indicates how to map data from the source file to the actual columns in the table. Define the format value with the relevant mapping type. See data mappings. | with (format="json", ingestionMapping = "[{\"column\":\"rownumber\", \"Properties\":{\"Path\":\"$.RowNumber\"}}, {\"column\":\"rowguid\", \"Properties\":{\"Path\":\"$.RowGuid\"}}]") (deprecated: avroMapping, csvMapping, jsonMapping) |
| ingestionMappingReference | A string value that indicates how to map data from the source file to the actual columns in the table using a named mapping policy object. Define the format value with the relevant mapping type. See data mappings. | with (format="csv", ingestionMappingReference = "Mapping1") (deprecated: avroMappingReference, csvMappingReference, jsonMappingReference) |
| creationTime | The datetime value (formatted as an ISO 8601 string) to use as the creation time of the ingested data extents. If unspecified, the current value (now()) is used. Overriding the default is useful when ingesting older data, so that the retention policy is applied correctly. When specified, make sure the Lookback property in the target table's effective Extents merge policy is aligned with the specified value. | with (creationTime="2017-02-13") |
| extend_schema | A Boolean value that, if specified, instructs the command to extend the schema of the table (defaults to false). This option applies only to .append and .set-or-append commands. The only allowed schema extensions add columns to the end of the table. | If the original table schema is (a:string, b:int), a valid schema extension would be (a:string, b:int, c:datetime, d:string), but (a:string, c:datetime) wouldn't be valid. |
| folder | For ingest-from-query commands, the folder to assign to the table. If the table already exists, this property overrides the table's folder. | with (folder="Tables/Temporary") |
| format | The data format (see supported data formats). | with (format="csv") |
| ingestIfNotExists | A string value that, if specified, prevents ingestion from succeeding if the table already has data tagged with an ingest-by: tag with the same value. This ensures idempotent data ingestion. For more information, see ingest-by: tags. | The properties with (ingestIfNotExists='["Part0001"]', tags='["ingest-by:Part0001"]') indicate that if data with the tag ingest-by:Part0001 already exists, the current ingestion isn't completed. If it doesn't already exist, the new ingestion has this tag set (in case a future ingestion attempts to ingest the same data again). |
| ignoreFirstRecord | A Boolean value that, if set to true, indicates that ingestion should ignore the first record of every file. This property is useful for files in CSV and similar formats, if the first record in the file is the column names. By default, false is assumed. | with (ignoreFirstRecord=false) |
| policy_ingestiontime | A Boolean value that, if specified, describes whether to enable the Ingestion Time Policy on a table created by this command. The default is true. | with (policy_ingestiontime=false) |
| recreate_schema | A Boolean value that, if specified, describes whether the command may recreate the schema of the table. This property applies only to the .set-or-replace command, and takes precedence over the extend_schema property if both are set. | with (recreate_schema=true) |
| tags | A list of tags to associate with the ingested data, formatted as a JSON string. | with (tags="['Tag1', 'Tag2']") |
| treatGzAsUncompressed | A Boolean value that, if set to true, indicates that files with the .gz extension aren't compressed. This flag is sometimes needed when ingesting from Amazon AWS S3. | with (treatGzAsUncompressed=true) |
| validationPolicy | A JSON string that indicates which validations to run during ingestion of data represented in CSV format. See Data ingestion for an explanation of the different options. | with (validationPolicy='{"ValidationOptions":1, "ValidationImplications":1}') (this is the default policy) |
| zipPattern | Use this property when ingesting data from storage that has a ZIP archive. A string value indicating the regular expression to use when selecting which files in the ZIP archive to ingest. All other files in the archive are ignored. | with (zipPattern="*.csv") |
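
As an illustration, several of these properties can be combined in a single with clause on the ingestion command that consumes the listing. A hedged sketch, assuming the <| composition form; the table name, mapping reference, and tag value are placeholders:

// placeholders: MyTable, Mapping1, and Tag1
.ingest-from-storage-queued into table MyTable with (
    format="csv",
    ingestionMappingReference="Mapping1",
    creationTime="2024-03-16",
    tags="['Tag1']"
)
<|
.list blobs (
    "https://mystorageaccount.blob.core.chinacloudapi.cn/datasets/myfolder;managed_identity=system"
)
Suffix=".csv"

Keeping the property set identical across runs helps batches coalesce and avoids the fragmentation described above.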

Authentication and authorization

Each storage connection string indicates the authorization method to use for access to the storage. Depending on the authorization method, the principal might need to be granted permissions on the external storage to perform the ingestion.

The following table lists the supported authentication methods and the permissions needed for ingesting data from external storage.

| Authentication method | Azure Blob Storage / Data Lake Storage Gen2 | Data Lake Storage Gen1 |
|---|---|---|
| Shared Access (SAS) token | List + Read | This authentication method isn't supported in Gen1. |
| Storage account access key | | This authentication method isn't supported in Gen1. |
| Managed identity | Storage Blob Data Reader | Reader |

The primary use of .list blobs is for queued ingestion, which is done asynchronously with no user context. Therefore, impersonation isn't supported.
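
For reference, the authorization method is encoded in the connection string itself. Hedged sketches of the methods above; the account, container, SAS token, and account key values are placeholders:

// Managed identity (system-assigned)
.list blobs ("https://mystorageaccount.blob.core.chinacloudapi.cn/datasets/myfolder;managed_identity=system")

// Shared Access (SAS) token appended as the URI query string
.list blobs ("https://mystorageaccount.blob.core.chinacloudapi.cn/datasets/myfolder?<SasToken>")

// Storage account access key appended after a semicolon
.list blobs ("https://mystorageaccount.blob.core.chinacloudapi.cn/datasets/myfolder;<AccountKey>")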

Path format

The PathFormat parameter lets you specify the pattern in the blob path from which the creation time is extracted. It consists of a sequence of text separators and partition elements. A partition element refers to a captured value, currently the creation time, and the text separator is any text enclosed in quotes. Consecutive partition elements must be set apart using a text separator.

[StringSeparator] Partition [StringSeparator] [Partition [StringSeparator] ...]

To construct the original file path prefix, partition elements are rendered as strings and separated with corresponding text separators. You can use the datetime_pattern macro (datetime_pattern(DateTimeFormat, PartitionName)) to specify the format used for rendering a datetime partition value. The macro adheres to the .NET format specification, and allows format specifiers to be enclosed in curly brackets. For example, the following two formats are equivalent:

  • 'year='yyyy'/month='MM
  • year={yyyy}/month={MM}
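
For example, with the .NET specifiers yyyy and MM, both spellings render a March 2024 value identically:

'year='yyyy'/month='MM   ->   year=2024/month=03
year={yyyy}/month={MM}   ->   year=2024/month=03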

By default, datetime values are rendered using the following formats:

| Partition function | Default format |
|---|---|
| startofyear | yyyy |
| startofmonth | yyyy/MM |
| startofweek | yyyy/MM/dd |
| startofday | yyyy/MM/dd |
| bin(Column, 1d) | yyyy/MM/dd |
| bin(Column, 1h) | yyyy/MM/dd/HH |
| bin(Column, 1m) | yyyy/MM/dd/HH/mm |
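
For instance, under the default format for bin(Column, 1h), a datetime of 2024-03-16 09:00 renders as:

2024/03/16/09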

Returns

The result of the command is a table with one record per blob listed.

| Name | Type | Description |
|---|---|---|
| BlobUri | string | The URI of the blob. |
| SizeInBytes | long | The size of the blob in bytes (its content length). |
| CapturedVariables | dynamic | The captured variables. Currently, only CreationTime is supported. |
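
For example, a blob matched by the PathFormat used in the examples below might produce a record shaped like the following; the values are illustrative only:

BlobUri            https://mystorageaccount.blob.core.chinacloudapi.cn/datasets/myfolder/year=2024/month=03/day=16/myblob.parquet
SizeInBytes        1048576
CapturedVariables  {"CreationTime": "2024-03-16T00:00:00.0000000Z"}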

Examples

List maximum number of blobs

The following command lists a maximum of 20 blobs from the myfolder folder using system-assigned managed identity authentication.

.list blobs (
    "https://mystorageaccount.blob.core.chinacloudapi.cn/datasets/myfolder;managed_identity=system"
)
MaxFiles=20

List Parquet blobs

The following command lists a maximum of 10 blobs of type .parquet from a folder, using system-assigned managed identity authentication.

.list blobs (
    "https://mystorageaccount.blob.core.chinacloudapi.cn/datasets/myfolder;managed_identity=system"
)
Suffix=".parquet"
MaxFiles=10

Capture date from blob path

The following command lists a maximum of 10 blobs of type .parquet from a folder, using system-assigned managed identity authentication, and extracts the date from the URL path.

.list blobs (
    "https://mystorageaccount.blob.core.chinacloudapi.cn/datasets/myfolder;managed_identity=system"
)
Suffix=".parquet"
MaxFiles=10
PathFormat=("myfolder/year=" datetime_pattern("yyyy'/month='MM'/day='dd", creationTime) "/")

The PathFormat in the example extracts dates from paths such as the following:

https://mystorageaccount.blob.core.chinacloudapi.cn/datasets/myfolder/year=2024/month=03/day=16/myblob.parquet