This article uses the Azure Blob Storage inventory feature and Azure Synapse to calculate the blob count and total size of blobs per container. These values are useful when optimizing blob usage per container.
Calculate blob count and total size per container using Azure Storage inventory
Prerequisites
Enable inventory reports
The first step in this method is to enable inventory reports on your storage account. You may have to wait up to 24 hours after enabling inventory reports for your first report to be generated.
When you have an inventory report to analyze, grant yourself read access to the container where the report CSV file resides by assigning yourself the Storage Blob Data Reader role. Be sure to use the email address of the account you're using to run the report. To learn how to assign an Azure role to a user with Azure role-based access control (Azure RBAC), follow the instructions provided in Assign Azure roles using the Azure portal.
Note
To calculate the blob size from the inventory report, make sure to include the Content-Length schema field in your rule definition.
Create an Azure Synapse workspace
Next, create an Azure Synapse workspace where you will execute a SQL query to report the inventory results.
Create the SQL query
After you create your Azure Synapse workspace, do the following steps.
Navigate to https://web.azuresynapse.azure.cn.
Select the Develop tab on the left edge.
Select the large plus sign (+) to add an item.
Select SQL script.
Run the SQL query
Follow the below steps:
Add the following SQL query in your Azure Synapse workspace to read the inventory CSV file.
For the
bulk
parameter, use the URL of the inventory report CSV file that you want to analyze.SELECT LEFT([Name], CHARINDEX('/', [Name]) - 1) AS Container, COUNT(*) As TotalBlobCount, SUM([Content-Length]) As TotalBlobSize FROM OPENROWSET( bulk '<URL to your inventory CSV file>', format='csv', parser_version='2.0', header_row=true ) AS Source GROUP BY LEFT([Name], CHARINDEX('/', [Name]) - 1)
Name your SQL query in the properties pane on the right.
Publish your SQL query by pressing CTRL+S or selecting the Publish all button.
Select the Run button to execute the SQL query. The blob count and total size per container are reported in the Results pane.