List blobs with Python
This article shows how to list blobs using the Azure Storage client library for Python.
To learn about listing blobs using asynchronous APIs, see List blobs asynchronously.
Prerequisites
- Azure subscription - create one for trial
- Azure storage account - create a storage account
- Python 3.8+
Set up your environment
If you don't have an existing project, this section shows you how to set up a project to work with the Azure Blob Storage client library for Python. For more details, see Get started with Azure Blob Storage and Python.
To work with the code examples in this article, follow these steps to set up your project.
Install packages
Install the following packages using pip install:
pip install azure-storage-blob azure-identity
Add import statements
Add the following import statements:
from azure.identity import DefaultAzureCredential
from azure.storage.blob import BlobServiceClient, ContainerClient, BlobPrefix
Authorization
The authorization mechanism must have the necessary permissions to list blobs. For authorization with Microsoft Entra ID (recommended), you need the Azure RBAC built-in role Storage Blob Data Reader or higher. To learn more, see the authorization guidance for List Blobs (REST API).
Create a client object
To connect an app to Blob Storage, create an instance of BlobServiceClient. The following example shows how to create a client object using DefaultAzureCredential for authorization:
# TODO: Replace <storage-account-name> with your actual storage account name
account_url = "https://<storage-account-name>.blob.core.chinacloudapi.cn"
credential = DefaultAzureCredential()
# Create the BlobServiceClient object
blob_service_client = BlobServiceClient(account_url, credential=credential)
You can also create client objects for specific containers or blobs, either directly or from the BlobServiceClient object. To learn more about creating and managing client objects, see Create and manage client objects that interact with data resources.
About blob listing options
When you list blobs from your code, you can specify many options to manage how results are returned from Azure Storage. You can specify the number of results to return in each set of results, and then retrieve the subsequent sets. You can specify a prefix to return blobs whose names begin with that character or string. And you can list blobs in a flat listing structure, or hierarchically. A hierarchical listing returns blobs as though they were organized into folders.
To list the blobs in a container using a flat listing, call one of these methods:
- ContainerClient.list_blobs (along with the name, you can optionally include metadata, tags, and other information associated with each blob)
- ContainerClient.list_blob_names (only returns blob name)
To list the blobs in a container using a hierarchical listing, call the following method:
- ContainerClient.walk_blobs (along with the name, you can optionally include metadata, tags, and other information associated with each blob)
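The paging option mentioned at the start of this section can be sketched as follows. This is a minimal sketch rather than one of the article's own samples: the helper name list_blobs_in_pages and the default page size are illustrative assumptions, and it relies on the results_per_page keyword argument and on the by_page() method of the pager object that list_blobs returns.

```python
from __future__ import annotations

from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Imported for type hints only; requires the azure-storage-blob package.
    from azure.storage.blob import ContainerClient


def list_blobs_in_pages(container_client: ContainerClient, page_size: int = 2) -> list[list[str]]:
    """Return blob names grouped by the page in which they were returned.

    Hypothetical helper for illustration; it is not part of the client library.
    """
    # results_per_page caps the number of results in each set;
    # by_page() iterates over the sets instead of individual blobs.
    pages = container_client.list_blobs(results_per_page=page_size).by_page()

    names_by_page = []
    for page in pages:
        # Each page is itself an iterable of blob properties.
        names_by_page.append([blob.name for blob in page])
    return names_by_page
```

Iterating the pager directly, without by_page(), yields individual blobs and fetches the subsequent sets for you, so explicit paging is mainly useful when you want to process or display results one set at a time.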
Filter results with a prefix
To filter the list of blobs, specify a string for the name_starts_with keyword argument. The prefix string can include one or more characters. Azure Storage then returns only the blobs whose names start with that prefix.
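A prefix filter might look like the following minimal sketch. The helper name list_blob_names_with_prefix is an illustrative assumption and not part of the client library; the only library call used is list_blobs with the name_starts_with keyword argument.

```python
from __future__ import annotations

from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Imported for type hints only; requires the azure-storage-blob package.
    from azure.storage.blob import ContainerClient


def list_blob_names_with_prefix(container_client: ContainerClient, prefix: str) -> list[str]:
    """Return the names of blobs whose names start with the given prefix.

    Hypothetical helper for illustration; the filtering itself is done
    by Azure Storage, not by this function.
    """
    blob_list = container_client.list_blobs(name_starts_with=prefix)
    return [blob.name for blob in blob_list]
```

For example, calling list_blob_names_with_prefix(container_client, "folderA/") would be expected to return only blobs under the folderA/ virtual directory, assuming blobs with that prefix exist in the container.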
Flat listing versus hierarchical listing
Blobs in Azure Storage are organized in a flat paradigm, rather than a hierarchical paradigm (like a classic file system). However, you can organize blobs into virtual directories in order to mimic a folder structure. A virtual directory forms part of the name of the blob and is indicated by the delimiter character.
To organize blobs into virtual directories, use a delimiter character in the blob name. The default delimiter character is a forward slash (/), but you can specify any character as the delimiter.
If you name your blobs using a delimiter, then you can choose to list blobs hierarchically. For a hierarchical listing operation, Azure Storage returns any virtual directories and blobs beneath the parent object. You can call the listing operation recursively to traverse the hierarchy, similar to how you would traverse a classic file system programmatically.
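To make the idea of virtual directories concrete, the following sketch simulates locally how a flat set of blob names maps to the first level of a hierarchy. It does not call Azure Storage, and the function name first_level_entries is a hypothetical helper used only for illustration; results are sorted purely to keep the output stable.

```python
def first_level_entries(blob_names, prefix="", delimiter="/"):
    """Return the virtual directories and blobs directly beneath a prefix.

    Local simulation for illustration only; a real hierarchical listing
    is performed by the service.
    """
    entries = set()
    for name in blob_names:
        if not name.startswith(prefix):
            continue
        # Look at the part of the name that follows the prefix.
        remainder = name[len(prefix):]
        head, found, _ = remainder.partition(delimiter)
        if found:
            # A delimiter remains, so this name sits inside a virtual directory.
            entries.add(prefix + head + delimiter)
        else:
            # No delimiter remains, so this is a blob at the current level.
            entries.add(name)
    return sorted(entries)


names = [
    "file4.txt",
    "folderA/file1.txt",
    "folderA/file2.txt",
    "folderA/folderB/file3.txt",
]
print(first_level_entries(names))              # ['file4.txt', 'folderA/']
print(first_level_entries(names, "folderA/"))  # ['folderA/file1.txt', 'folderA/file2.txt', 'folderA/folderB/']
```

Calling the function again with each returned directory name as the new prefix mirrors the recursive traversal described above.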
Use a flat listing
By default, a listing operation returns blobs in a flat listing. In a flat listing, blobs aren't organized by virtual directory.
The following example lists the blobs in the specified container using a flat listing:
def list_blobs_flat(self, blob_service_client: BlobServiceClient, container_name):
    container_client = blob_service_client.get_container_client(container=container_name)

    blob_list = container_client.list_blobs()

    for blob in blob_list:
        print(f"Name: {blob.name}")
Sample output is similar to:
List blobs flat:
Name: file4.txt
Name: folderA/file1.txt
Name: folderA/file2.txt
Name: folderA/folderB/file3.txt
You can also specify options to filter list results or show additional information. The following example lists blobs and blob tags:
def list_blobs_flat_options(self, blob_service_client: BlobServiceClient, container_name):
    container_client = blob_service_client.get_container_client(container=container_name)

    blob_list = container_client.list_blobs(include=['tags'])

    for blob in blob_list:
        print(f"Name: {blob['name']}, Tags: {blob['tags']}")
Sample output is similar to:
List blobs flat:
Name: file4.txt, Tags: None
Name: folderA/file1.txt, Tags: None
Name: folderA/file2.txt, Tags: None
Name: folderA/folderB/file3.txt, Tags: {'tag1': 'value1', 'tag2': 'value2'}
Note
The sample output shown assumes that you have a storage account with a flat namespace. If you've enabled the hierarchical namespace feature for your storage account, directories are not virtual. Instead, they are concrete, independent objects. As a result, directories appear in the list as zero-length blobs.
For an alternative listing option when working with a hierarchical namespace, see List directory contents (Azure Data Lake Storage Gen2).
Use a hierarchical listing
When you call a listing operation hierarchically, Azure Storage returns the virtual directories and blobs at the first level of the hierarchy.
To list blobs hierarchically, use the following method:
- ContainerClient.walk_blobs
The following example lists the blobs in the specified container using a hierarchical listing:
# depth and indent are attributes of the sample class that contains this method
depth = 0
indent = " "

def list_blobs_hierarchical(self, container_client: ContainerClient, prefix):
    for blob in container_client.walk_blobs(name_starts_with=prefix, delimiter='/'):
        if isinstance(blob, BlobPrefix):
            # Indentation is only added to show nesting in the output
            print(f"{self.indent * self.depth}{blob.name}")
            self.depth += 1
            self.list_blobs_hierarchical(container_client, prefix=blob.name)
            self.depth -= 1
        else:
            print(f"{self.indent * self.depth}{blob.name}")
Sample output is similar to:
folderA/
 folderA/folderB/
  folderA/folderB/file3.txt
 folderA/file1.txt
 folderA/file2.txt
file4.txt
Note
Blob snapshots cannot be listed in a hierarchical listing operation.
List blobs asynchronously
The Azure Blob Storage client library for Python supports listing blobs asynchronously. To learn more about project setup requirements, see Asynchronous programming.
Follow these steps to list blobs using asynchronous APIs:
- Add the following import statements:
import asyncio
from azure.identity.aio import DefaultAzureCredential
from azure.storage.blob.aio import BlobServiceClient, ContainerClient, BlobPrefix
- Add code to run the program using asyncio.run. This function runs the passed coroutine, main() in our example, and manages the asyncio event loop. Coroutines are declared with the async/await syntax. In this example, the main() coroutine first creates the top-level BlobServiceClient using async with, then calls the method that lists the blobs. Note that only the top-level client needs to use async with, as other clients created from it share the same connection pool.
async def main():
    sample = BlobSamples()

    # TODO: Replace <storage-account-name> with your actual storage account name
    account_url = "https://<storage-account-name>.blob.core.chinacloudapi.cn"
    credential = DefaultAzureCredential()

    async with BlobServiceClient(account_url, credential=credential) as blob_service_client:
        await sample.list_blobs_flat(blob_service_client, "sample-container")

if __name__ == '__main__':
    asyncio.run(main())
- Add code to list the blobs. The following code example lists blobs using a flat listing. The code is the same as the synchronous example, except that the method is declared with the async keyword and async for is used when calling the list_blobs method.
async def list_blobs_flat(self, blob_service_client: BlobServiceClient, container_name):
    container_client = blob_service_client.get_container_client(container=container_name)

    async for blob in container_client.list_blobs():
        print(f"Name: {blob.name}")
With this basic setup in place, you can implement other examples in this article as coroutines using async/await syntax.
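As one illustration of adapting another example, the following minimal sketch applies the prefix filter in a coroutine. The function name list_blob_names_async is a hypothetical helper, not part of the client library, and the sketch assumes the asynchronous list_blobs method accepts the same name_starts_with keyword argument as its synchronous counterpart.

```python
from __future__ import annotations

from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Imported for type hints only; requires the azure-storage-blob package.
    from azure.storage.blob.aio import ContainerClient


async def list_blob_names_async(container_client: ContainerClient, prefix: str | None = None) -> list[str]:
    """Collect blob names, optionally filtered by prefix, using async iteration.

    Hypothetical helper for illustration only.
    """
    # list_blobs is called without await; the returned pager is consumed with async for.
    return [blob.name async for blob in container_client.list_blobs(name_starts_with=prefix)]
```

Inside the main() coroutine, this could be awaited with a container client obtained from the top-level BlobServiceClient, for example names = await list_blob_names_async(blob_service_client.get_container_client("sample-container"), "folderA/").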
Resources
To learn more about how to list blobs using the Azure Blob Storage client library for Python, see the following resources.
Code samples
- View synchronous or asynchronous code samples from this article (GitHub)
REST API operations
The Azure SDK for Python contains libraries that build on top of the Azure REST API, allowing you to interact with REST API operations through familiar Python paradigms. The client library methods for listing blobs use the following REST API operation:
- List Blobs (REST API)
Related content
- This article is part of the Blob Storage developer guide for Python. To learn more, see the full list of developer guide articles at Build your Python app.