Calculate the total billing size of a blob container

This script calculates the size of a container in Azure Blob storage for the purpose of estimating billing costs. It totals the container's own overhead and the size of every blob in the container.

This sample requires Azure PowerShell. Run Get-Module -ListAvailable Az to find the version. If you need to install or upgrade, see Install Azure PowerShell module.

Run Connect-AzAccount -Environment AzureChinaCloud to create a connection with Azure.

If you don't have an Azure subscription, create a trial account before you begin.

Note

This PowerShell script calculates the size of a container for billing purposes. If you are calculating container size for other purposes, see Calculate the total size of a Blob storage container for a simpler script that provides an estimate.

Determine the size of the blob container

The total size of the blob container includes the size of the container itself and the size of all blobs under the container.

The following sections describe how the storage capacity is calculated for blob containers and blobs. In these sections, Len(X) means the number of characters in the string X.

Blob containers

The following calculation describes how to estimate the amount of storage that's consumed per blob container:

48 bytes + Len(ContainerName) * 2 bytes + For-Each Metadata[3 bytes + Len(MetadataName) + Len(Value)] + For-Each Signed Identifier[512 bytes]

Following is the breakdown:

  • 48 bytes of overhead for each container includes the Last Modified Time, Permissions, Public Settings, and some system metadata.

  • The container name is stored as Unicode, so take the number of characters and multiply by two.

  • For each block of blob container metadata that's stored, add 3 bytes, plus the length of the name (stored as ASCII), plus the length of the string value.

  • The 512 bytes per Signed Identifier includes signed identifier name, start time, expiry time, and permissions.
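
As a worked illustration of the container formula above, here is a minimal sketch in PowerShell. The function name Get-ContainerOverheadBytes and its parameters are hypothetical and are not part of the sample script; the sketch only applies the arithmetic to values you pass in and does not call Azure.

```powershell
# Hypothetical helper (not part of the sample script): estimates the storage
# consumed by the container itself, excluding the blobs it holds.
function Get-ContainerOverheadBytes
{
    param(
        [Parameter(Mandatory=$true)]
        [string]$ContainerName,
        [hashtable]$Metadata = @{},
        [int]$SignedIdentifierCount = 0)

    # 48 bytes of fixed overhead + 2 bytes per character of the Unicode name
    $bytes = 48 + $ContainerName.Length * 2

    # 3 bytes + name length + value length for each metadata entry
    foreach ($entry in $Metadata.GetEnumerator())
    {
        $bytes += 3 + $entry.Key.Length + $entry.Value.Length
    }

    # 512 bytes for each signed identifier
    $bytes += $SignedIdentifierCount * 512

    return $bytes
}

# Container "images" with one metadata pair (owner = alice) and two signed identifiers:
# 48 + 6*2 + (3 + 5 + 5) + 2*512 = 1097
Get-ContainerOverheadBytes -ContainerName "images" -Metadata @{ owner = "alice" } -SignedIdentifierCount 2
```

A container with no metadata and no signed identifiers costs only the fixed 48 bytes plus two bytes per character of its name.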

Blobs

The following calculations show how to estimate the amount of storage consumed per blob.

  • Block blob (base blob or snapshot):

    124 bytes + Len(BlobName) * 2 bytes + For-Each Metadata[3 bytes + Len(MetadataName) + Len(Value)] + 8 bytes + number of committed and uncommitted blocks * Block ID Size in bytes + SizeInBytes(data in unique committed data blocks stored) + SizeInBytes(data in uncommitted data blocks)

  • Page blob (base blob or snapshot):

    124 bytes + Len(BlobName) * 2 bytes + For-Each Metadata[3 bytes + Len(MetadataName) + Len(Value)] + number of nonconsecutive page ranges with data * 12 bytes + SizeInBytes(data in unique pages stored)

Following is the breakdown:

  • 124 bytes of overhead for blob, which includes:

    • Last Modified Time
    • Size
    • Cache-Control
    • Content-Type
    • Content-Language
    • Content-Encoding
    • Content-MD5
    • Permissions
    • Snapshot information
    • Lease
    • Some system metadata
  • The blob name is stored as Unicode, so take the number of characters and multiply by two.

  • For each block of metadata that's stored, add 3 bytes, plus the length of the name (stored as ASCII), plus the length of the string value.

  • For block blobs:

    • 8 bytes for the block list.

    • Number of blocks times the block ID size in bytes.

    • The size of the data in all of the committed and uncommitted blocks.

      Note

      When snapshots are used, this size includes only the unique data for this base or snapshot blob. If the uncommitted blocks are not used after a week, they are garbage-collected. After that, they don't count toward billing.

  • For page blobs:

    • The number of nonconsecutive page ranges with data times 12 bytes. This is the number of unique page ranges you see when calling the GetPageRanges API.

    • The size of the data in bytes of all of the stored pages.

      Note

      When snapshots are used, this size includes only the unique pages for the base blob or the snapshot blob that's being counted.
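
To make the two blob formulas concrete, the following sketch applies them to values supplied by the caller. All function and parameter names here (Get-BlobOverheadBytes, Get-BlockBlobEstimate, Get-PageBlobEstimate, and their parameters) are hypothetical and are not part of the sample script. The sketch assumes every block ID in a blob has the same size and that you already know the block count, page range count, and data size.

```powershell
# Hypothetical helper: the part shared by both formulas
# (124 bytes fixed overhead + Unicode name + metadata).
function Get-BlobOverheadBytes
{
    param(
        [Parameter(Mandatory=$true)]
        [string]$BlobName,
        [hashtable]$Metadata = @{})

    $bytes = 124 + $BlobName.Length * 2
    foreach ($entry in $Metadata.GetEnumerator())
    {
        # 3 bytes + name length + value length for each metadata entry
        $bytes += 3 + $entry.Key.Length + $entry.Value.Length
    }
    return $bytes
}

# Hypothetical helper: block blob estimate.
# Assumes all block IDs have the same size ($BlockIdSizeInBytes).
function Get-BlockBlobEstimate
{
    param(
        [Parameter(Mandatory=$true)]
        [string]$BlobName,
        [hashtable]$Metadata = @{},
        [int]$BlockCount = 0,
        [int]$BlockIdSizeInBytes = 0,
        [long]$DataSizeInBytes = 0)

    $overhead = Get-BlobOverheadBytes -BlobName $BlobName -Metadata $Metadata
    # 8 bytes for the block list + block IDs + committed and uncommitted data
    return $overhead + 8 + $BlockCount * $BlockIdSizeInBytes + $DataSizeInBytes
}

# Hypothetical helper: page blob estimate.
function Get-PageBlobEstimate
{
    param(
        [Parameter(Mandatory=$true)]
        [string]$BlobName,
        [hashtable]$Metadata = @{},
        [int]$PageRangeCount = 0,
        [long]$DataSizeInBytes = 0)

    $overhead = Get-BlobOverheadBytes -BlobName $BlobName -Metadata $Metadata
    # 12 bytes per nonconsecutive page range + data in the stored pages
    return $overhead + $PageRangeCount * 12 + $DataSizeInBytes
}

# Block blob "report.csv": 2 blocks with 8-byte IDs holding 8,388,608 bytes of data
# 124 + 10*2 + 8 + 2*8 + 8388608 = 8388776
Get-BlockBlobEstimate -BlobName "report.csv" -BlockCount 2 -BlockIdSizeInBytes 8 -DataSizeInBytes 8388608

# Page blob "disk.vhd": 3 nonconsecutive page ranges holding 1,536 bytes of data
# 124 + 8*2 + 3*12 + 1536 = 1712
Get-PageBlobEstimate -BlobName "disk.vhd" -PageRangeCount 3 -DataSizeInBytes 1536
```

For small blobs the fixed overhead and name dominate; for large blobs the data term dominates and the overhead becomes negligible.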

Sample script


# this script will show how to get the total size of the blobs in a container
# before running this, you need to create a storage account, create a container,
#    and upload some blobs into the container
# note: this lists the blobs in the container in batches of up to 5,000 per request.
#       connect to Azure with Connect-AzAccount -Environment AzureChinaCloud before you run the script.
#       requests sent as part of this tool will incur transactional costs.
# command line usage: script.ps1 -ResourceGroup {YourResourceGroupName} -StorageAccountName {YourAccountName} -ContainerName {YourContainerName}
#


param(
    [Parameter(Mandatory=$true)]
    [string]$ResourceGroup,

    [Parameter(Mandatory=$true)]
    [string]$StorageAccountName,

    [Parameter(Mandatory=$true)]
    [string]$ContainerName
)

# Set-StrictMode is left disabled: with it enabled, the single object that Get-AzStorageBlob returns
# when there is only one blob can be handled differently from an array of blobs.
#Set-StrictMode -Version 2

$VerbosePreference = "Continue"

if($null -eq (Get-Module -ListAvailable Az.Storage))
{
    throw "Azure PowerShell not found! Please install from https://docs.microsoft.com/en-us/powershell/azure/install-Az-ps"
}

# function Retry-OnRequest
function Retry-OnRequest
{
    param(
        [Parameter(Mandatory=$true)]
        $Action)
    
    # A request can hit various temporary errors, such as network errors or server-busy errors.
    # Retry the request when the error is transient.

    # Retry on storage server timeout errors
    $clientTimeOut = New-TimeSpan -Minutes 15
    $retryPolicy = New-Object -TypeName Microsoft.WindowsAzure.Storage.RetryPolicies.ExponentialRetry -ArgumentList @($clientTimeOut, 10)        
    $requestOption = @{}
    $requestOption.RetryPolicy = $retryPolicy

    # Retry on temporary network errors
    $shouldRetryOnException = $false
    $maxRetryCountOnException = 3

    do
    {
        try
        {
            return $Action.Invoke($requestOption)
        }
        catch
        {
            if ($_.Exception.InnerException -ne $null -And $_.Exception.InnerException.GetType() -Eq [System.TimeoutException] -And $maxRetryCountOnException -gt 0)
            {
                $shouldRetryOnException = $true
                $maxRetryCountOnException--
            }
            else
            {
                $shouldRetryOnException = $false
                throw
            }
        }
    }
    while ($shouldRetryOnException)

}

# function Get-BlobBytes

function Get-BlobBytes
{
    param(
        [Parameter(Mandatory=$true)]
        $Blob,
        [Parameter(Mandatory=$false)]
        [bool]$IsPremiumAccount = $false)

    # Base + blobname
    $blobSizeInBytes = 124 + $Blob.Name.Length * 2

    # Get size of metadata
    $metadataEnumerator=$Blob.ICloudBlob.Metadata.GetEnumerator()
    while($metadataEnumerator.MoveNext())
    {
        $blobSizeInBytes += 3 + $metadataEnumerator.Current.Key.Length + $metadataEnumerator.Current.Value.Length
    }

    if (!$IsPremiumAccount)
    {
        if($Blob.BlobType -eq [Microsoft.WindowsAzure.Storage.Blob.BlobType]::BlockBlob)
        {
            $blobSizeInBytes += 8
            # Default is Microsoft.WindowsAzure.Storage.Blob.BlockListingFilter.Committed. Need All
            $action = { param($requestOption) return $Blob.ICloudBlob.DownloadBlockList([Microsoft.WindowsAzure.Storage.Blob.BlockListingFilter]::All, $null, $requestOption) }                

            $blocks=Retry-OnRequest $action      

            if ($null -eq $blocks)
            {
                $blobSizeInBytes += $Blob.ICloudBlob.Properties.Length
            }
            else
            {
                $blocks | ForEach-Object { $blobSizeInBytes += $_.Length + $_.Name.Length }
            }  
        }
        elseif($Blob.BlobType -eq [Microsoft.WindowsAzure.Storage.Blob.BlobType]::PageBlob)
        {
            # Getting all page ranges of a highly fragmented page blob in one call can cause a server timeout.
            # Getting the page ranges in segments reduces the chance of hitting that timeout.
            # See https://blogs.msdn.microsoft.com/windowsazurestorage/2012/03/26/getting-the-page-ranges-of-a-large-page-blob-in-segments/ for details.
            $pageRangesSegSize = 148 * 1024 * 1024L
            $totalSize = $Blob.ICloudBlob.Properties.Length
            $pageRangeSegOffset = 0
        
            $pageRangesTemp = New-Object System.Collections.ArrayList
        
            while ($pageRangeSegOffset -lt $totalSize)
            {
                $action = {param($requestOption) return $Blob.ICloudBlob.GetPageRanges($pageRangeSegOffset, $pageRangesSegSize, $null, $requestOption) }

                Retry-OnRequest $action | ForEach-Object { $pageRangesTemp.Add($_) }  | Out-Null
                $pageRangeSegOffset += $pageRangesSegSize
            }

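            $pageRanges = New-Object System.Collections.ArrayList
            # Initialize explicitly so the merge loop does not depend on an unset variable
            $lastRange = $null

            # Merge ranges that are adjacent across segment boundaries into one range
            foreach ($pageRange in $pageRangesTemp)
            {
                if($null -eq $lastRange)
                {
                    $lastRange = New-Object PageRange
                    $lastRange.StartOffset = $pageRange.StartOffset
                    $lastRange.EndOffset =  $pageRange.EndOffset
                }
                else
                {
                    if (($lastRange.EndOffset + 1) -eq $pageRange.StartOffset)
                    {
                        $lastRange.EndOffset = $pageRange.EndOffset
                    }
                    else
                    {
                        $pageRanges.Add($lastRange)  | Out-Null
                        $lastRange = New-Object PageRange
                        $lastRange.StartOffset = $pageRange.StartOffset
                        $lastRange.EndOffset =  $pageRange.EndOffset
                    }
                }
            }

            # Add the final merged range, unless the blob has no page ranges at all
            if ($null -ne $lastRange)
            {
                $pageRanges.Add($lastRange) | Out-Null
            }
            $pageRanges |  ForEach-Object { 
                    # EndOffset is inclusive, so a range holds EndOffset - StartOffset + 1 bytes
                    $blobSizeInBytes += 12 + $_.EndOffset - $_.StartOffset + 1
                }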
        }
        else
        {
            $blobSizeInBytes += $Blob.ICloudBlob.Properties.Length
        }
        return $blobSizeInBytes
    }
    else
    {
        $blobSizeInBytes += $Blob.ICloudBlob.Properties.Length
    }
    return $blobSizeInBytes
}

# function Get-ContainerBytes

function Get-ContainerBytes
{
    param(
        [Parameter(Mandatory=$true)]
        [Microsoft.WindowsAzure.Storage.Blob.CloudBlobContainer]$Container,
        [Parameter(Mandatory=$false)]
        [bool]$IsPremiumAccount = $false)

    # Base + name of container
    $containerSizeInBytes = 48 + $Container.Name.Length*2

    # Get size of metadata
    $metadataEnumerator = $Container.Metadata.GetEnumerator()
    while($metadataEnumerator.MoveNext())
    {
        $containerSizeInBytes += 3 + $metadataEnumerator.Current.Key.Length + $metadataEnumerator.Current.Value.Length
    }

    # Get size for SharedAccessPolicies
    $containerSizeInBytes += $Container.GetPermissions().SharedAccessPolicies.Count * 512

    # Calculate size of all blobs.
    $blobCount = 0
    $Token = $Null
    $MaxReturn = 5000

    do {
        $Blobs = Get-AzStorageBlob -Context $storageContext -Container $Container.Name -MaxCount $MaxReturn -ContinuationToken $Token
        if($null -eq $Blobs) { break }

        # Get-AzStorageBlob returns a single object rather than an array when there is only one blob, so check the type before indexing
        if($Blobs.GetType().Name -eq "AzureStorageBlob")
        {
            $Token = $Null
        }
        else
        {
            $Token = $Blobs[$Blobs.Count - 1].ContinuationToken;
        }

        $Blobs | ForEach-Object {
                $blobSize = Get-BlobBytes $_ $IsPremiumAccount
                $containerSizeInBytes += $blobSize
                $blobCount++

                if(($blobCount % 1000) -eq 0)
                {
                    Write-Verbose("Counting {0} Sizing {1} " -f $blobCount, $containerSizeInBytes)
                }
            }
    }
    While ($Token -ne $Null)

    return @{ "containerSize" = $containerSizeInBytes; "blobCount" = $blobCount }
}

#Connect-AzAccount -Environment AzureChinaCloud

$storageAccount = Get-AzStorageAccount -ResourceGroupName $ResourceGroup -Name $StorageAccountName -ErrorAction SilentlyContinue
if($null -eq $storageAccount)
{
    throw "The storage account specified does not exist in this subscription."
}

$storageContext = $storageAccount.Context

if (-not ([System.Management.Automation.PSTypeName]'PageRange').Type)
{
    $Source = "
        public class PageRange
        {
            public long StartOffset;
            public long EndOffset;
        }"
    Add-Type -TypeDefinition $Source
}

$containers = New-Object System.Collections.ArrayList
if($ContainerName.Length -ne 0)
{
    Get-AzStorageContainer -Context $storageContext -Name $ContainerName -ErrorAction SilentlyContinue |
        ForEach-Object { $containers.Add($_) } | Out-Null
}
else
{
    Get-AzStorageContainer -Context $storageContext | ForEach-Object { $containers.Add($_) } | Out-Null
}

$sizeInBytes = 0
$IsPremiumAccount = ($storageAccount.Sku.Tier -eq "Premium")

if($containers.Count -gt 0)
{
    $containers | ForEach-Object {
        Write-Output("Calculating container {0} ..." -f $_.CloudBlobContainer.Name)
        $result = Get-ContainerBytes $_.CloudBlobContainer $IsPremiumAccount
        $sizeInBytes += $result.containerSize

        Write-Output("Container '{0}' with {1} blobs has a size of {2:F2} MB." -f $_.CloudBlobContainer.Name,$result.blobCount,($result.containerSize/1MB))
    }
}
else
{
    Write-Warning "No containers found to process in storage account '$StorageAccountName'."
}

Next steps