Manage the Azure Blob storage lifecycle

Data sets have unique lifecycles. Early in the lifecycle, people access some data often, but the need for access drops drastically as the data ages. Some data stays idle in the cloud and is rarely accessed once stored. Some data expires days or months after creation, while other data sets are actively read and modified throughout their lifetimes. Azure Blob storage lifecycle management offers a rich, rule-based policy for GPv2 and Blob storage accounts. Use the policy to transition your data to the appropriate access tiers, or to expire the data at the end of its lifecycle.

The lifecycle management policy lets you:

  • Transition blobs to a cooler storage tier (hot to cool, hot to archive, or cool to archive) to optimize for performance and cost
  • Delete blobs at the end of their lifecycles
  • Define rules to be run once per day at the storage account level
  • Apply rules to containers or a subset of blobs (using prefixes as filters)

Consider a scenario where data is accessed frequently during the early stages of its lifecycle, but only occasionally after two weeks. Beyond the first month, the data set is rarely accessed. In this scenario, hot storage is best during the early stages, cool storage is most appropriate for the period of occasional access, and archive storage is the best tier option once the data is more than a month old. By matching storage tiers to the age of the data, you can design the least expensive storage option for your needs. To achieve this transition, lifecycle management policy rules are available to move aging data to cooler tiers.
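The scenario above can be sketched as a simple mapping from data age to access tier. This is only an illustration of the reasoning, not part of any Azure SDK: the function name `suggested_tier` and the exact cut-offs of 14 and 30 days are assumptions that read "two weeks" and "a month" literally.

```python
def suggested_tier(age_days: int) -> str:
    """Return the tier the scenario suggests for data of the given age.

    The 14-day and 30-day thresholds are assumptions taken from the
    "two weeks" and "one month" wording of the scenario.
    """
    if age_days < 0:
        raise ValueError("age_days must be non-negative")
    if age_days <= 14:
        return "hot"      # frequent access in the early stage
    if age_days <= 30:
        return "cool"     # occasional access after two weeks
    return "archive"      # rarely accessed after the first month


for age in (3, 20, 45):
    print(age, suggested_tier(age))
```

In a real policy these thresholds would be expressed as rule conditions rather than application code, as the later sections show.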

Storage account support

The lifecycle management policy is available with General Purpose v2 (GPv2) accounts and Blob storage accounts. In the Azure portal, you can upgrade an existing General Purpose (GPv1) account to a GPv2 account. For more information about storage accounts, see Azure storage account overview.

Pricing

The lifecycle management feature is free of charge. Customers are charged the regular operation cost for the List Blobs and Set Blob Tier API calls. The delete operation is free. For more information about pricing, see Block Blob pricing.

Regional availability

The lifecycle management feature is available in all regions.

Add or remove a policy

You can add, edit, or remove a policy in several ways. This article shows how to manage a policy by using the Azure portal and PowerShell.

Note

If you enable firewall rules for your storage account, lifecycle management requests may be blocked. You can unblock these requests by providing exceptions for trusted Azure services. For more information, see the Exceptions section in Configure firewalls and virtual networks.

Azure portal

There are two ways to add a policy through the Azure portal.

Azure portal List view

  1. Sign in to the Azure portal.

  2. Select All resources and then select your storage account.

  3. Under Blob Service, select Lifecycle management to view or change your rules.

  4. Select the List view tab.

  5. Select Add rule and then fill out the Action set form fields. In the following example, blobs are moved to cool storage if they haven't been modified for 30 days.

    (Screenshot: lifecycle management Action set page in the Azure portal)

  6. Select Filter set to add an optional filter. Then, select Browse to specify a container and folder by which to filter.

    (Screenshot: lifecycle management Filter set page in the Azure portal)

  7. Select Review + add to review the policy settings.

  8. Select Add to add the new policy.

Azure portal Code view

  1. Sign in to the Azure portal.

  2. Select All resources and then select your storage account.

  3. Under Blob Service, select Lifecycle management to view or change your policy.

  4. The following JSON is an example of a policy that can be pasted into the Code view tab.

    {
      "rules": [
        {
          "name": "ruleFoo",
          "enabled": true,
          "type": "Lifecycle",
          "definition": {
            "filters": {
              "blobTypes": [ "blockBlob" ],
              "prefixMatch": [ "container1/foo" ]
            },
            "actions": {
              "baseBlob": {
                "tierToCool": { "daysAfterModificationGreaterThan": 30 },
                "tierToArchive": { "daysAfterModificationGreaterThan": 90 },
                "delete": { "daysAfterModificationGreaterThan": 2555 }
              },
              "snapshot": {
                "delete": { "daysAfterCreationGreaterThan": 90 }
              }
            }
          }
        }
      ]
    }
    
  5. Select Save.

  6. For more information about this JSON example, see the Policy and Rules sections.

PowerShell

The following PowerShell script adds a policy to your storage account. Initialize the $rgname variable with your resource group name and the $accountName variable with your storage account name.

#Install the latest module
Install-Module -Name Az -Repository PSGallery

#Initialize the following with your resource group and storage account names
$rgname = ""
$accountName = ""

#Create a new action object
$action = Add-AzStorageAccountManagementPolicyAction -BaseBlobAction Delete -daysAfterModificationGreaterThan 2555
$action = Add-AzStorageAccountManagementPolicyAction -InputObject $action -BaseBlobAction TierToArchive -daysAfterModificationGreaterThan 90
$action = Add-AzStorageAccountManagementPolicyAction -InputObject $action -BaseBlobAction TierToCool -daysAfterModificationGreaterThan 30
$action = Add-AzStorageAccountManagementPolicyAction -InputObject $action -SnapshotAction Delete -daysAfterCreationGreaterThan 90

#Create a new filter object
#PowerShell automatically sets BlobType as "blockblob" because it is the only available option currently
$filter = New-AzStorageAccountManagementPolicyFilter -PrefixMatch ab,cd

#Create a new rule object
#PowerShell automatically sets Type as "Lifecycle" because it is the only available option currently
$rule1 = New-AzStorageAccountManagementPolicyRule -Name Test -Action $action -Filter $filter

#Set the policy
$policy = Set-AzStorageAccountManagementPolicy -ResourceGroupName $rgname -StorageAccountName $accountName -Rule $rule1

Azure Resource Manager template with lifecycle management policy

You can define lifecycle management by using Azure Resource Manager templates. Here is a sample template that deploys an RA-GRS GPv2 storage account with a lifecycle management policy.

{
  "$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json#",
  "contentVersion": "1.0.0.0",
  "parameters": {},
  "variables": {
    "storageAccountName": "[uniqueString(resourceGroup().id)]"
  },
  "resources": [
    {
      "type": "Microsoft.Storage/storageAccounts",
      "name": "[variables('storageAccountName')]",
      "location": "[resourceGroup().location]",
      "apiVersion": "2019-04-01",
      "sku": {
        "name": "Standard_RAGRS"
      },
      "kind": "StorageV2",
      "properties": {
        "networkAcls": {}
      }
    },
    {
      "name": "[concat(variables('storageAccountName'), '/default')]",
      "type": "Microsoft.Storage/storageAccounts/managementPolicies",
      "apiVersion": "2019-04-01",
      "dependsOn": [
        "[variables('storageAccountName')]"
      ],
      "properties": {
        "policy": {...}
      }
    }
  ],
  "outputs": {}
}

Policy

A lifecycle management policy is a collection of rules in a JSON document:

{
  "rules": [
    {
      "name": "rule1",
      "enabled": true,
      "type": "Lifecycle",
      "definition": {...}
    },
    {
      "name": "rule2",
      "type": "Lifecycle",
      "definition": {...}
    }
  ]
}

A policy has a single top-level parameter:

| Parameter name | Parameter type | Notes |
|---|---|---|
| rules | An array of rule objects | At least one rule is required in a policy. You can define up to 100 rules in a policy. |

Each rule within the policy has several parameters:

| Parameter name | Parameter type | Notes | Required |
|---|---|---|---|
| name | String | A rule name can include up to 256 alphanumeric characters. The rule name is case-sensitive. It must be unique within a policy. | True |
| enabled | Boolean | An optional boolean that allows a rule to be temporarily disabled. The default value is true if it's not set. | False |
| type | An enum value | The current valid type is Lifecycle. | True |
| definition | An object that defines the lifecycle rule | Each definition is made up of a filter set and an action set. | True |
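The constraints in the two tables above can be checked locally before a policy is submitted. The following is a minimal sketch under stated assumptions: `validate_policy` is a hypothetical helper, not an Azure API, and it reads "alphanumeric" as ASCII letters and digits only.

```python
import re

MAX_RULES = 100          # "up to 100 rules in a policy"
MAX_NAME_LENGTH = 256    # "up to 256 alphanumeric characters"
# Assumption: "alphanumeric" means ASCII letters and digits.
_NAME_PATTERN = re.compile(r"[A-Za-z0-9]+")


def validate_policy(policy: dict) -> list[str]:
    """Return a list of problems found; an empty list means none were found."""
    rules = policy.get("rules")
    if not isinstance(rules, list) or not rules:
        return ["a policy requires at least one rule"]

    errors = []
    if len(rules) > MAX_RULES:
        errors.append(f"a policy allows at most {MAX_RULES} rules")

    seen_names = set()
    for index, rule in enumerate(rules):
        name = rule.get("name")
        label = name if isinstance(name, str) and name else f"rules[{index}]"
        if not isinstance(name, str) or not _NAME_PATTERN.fullmatch(name):
            errors.append(f"{label}: name must be a non-empty alphanumeric string")
        elif len(name) > MAX_NAME_LENGTH:
            errors.append(f"{label}: name exceeds {MAX_NAME_LENGTH} characters")
        elif name in seen_names:
            # Names are case-sensitive, so "Rule1" and "rule1" are distinct.
            errors.append(f"{label}: name must be unique within the policy")
        else:
            seen_names.add(name)

        if rule.get("type") != "Lifecycle":
            errors.append(f"{label}: type must be 'Lifecycle'")
        if not isinstance(rule.get("definition"), dict):
            errors.append(f"{label}: definition is required")
        # "enabled" is optional and defaults to true when it is not set.
        if not isinstance(rule.get("enabled", True), bool):
            errors.append(f"{label}: enabled must be a boolean")
    return errors


sample = {"rules": [{"name": "rule1", "type": "Lifecycle", "definition": {}}]}
print(validate_policy(sample))
```

The service performs its own validation when a policy is saved; a local check like this only catches the documented constraints earlier.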

Rules

Each rule definition includes a filter set and an action set. The filter set limits rule actions to a certain set of objects, selected by container or by object name prefix. The action set applies the tier or delete actions to the filtered set of objects.

Sample rule

The following sample rule filters the account to run the actions on objects that exist inside container1 and start with foo.

  • Tier blob to cool tier 30 days after last modification
  • Tier blob to archive tier 90 days after last modification
  • Delete blob 2,555 days (seven years) after last modification
  • Delete blob snapshots 90 days after snapshot creation

{
  "rules": [
    {
      "name": "ruleFoo",
      "enabled": true,
      "type": "Lifecycle",
      "definition": {
        "filters": {
          "blobTypes": [ "blockBlob" ],
          "prefixMatch": [ "container1/foo" ]
        },
        "actions": {
          "baseBlob": {
            "tierToCool": { "daysAfterModificationGreaterThan": 30 },
            "tierToArchive": { "daysAfterModificationGreaterThan": 90 },
            "delete": { "daysAfterModificationGreaterThan": 2555 }
          },
          "snapshot": {
            "delete": { "daysAfterCreationGreaterThan": 90 }
          }
        }
      }
    }
  ]
}
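If you maintain several rules of this shape, it can be convenient to generate the JSON rather than write it by hand. The sketch below builds the same document as the sample rule using only the Python standard library. The helper `make_rule` and its parameter names are hypothetical; only the JSON keys come from this article.

```python
import json


def make_rule(name, prefixes=None, cool_after=None, archive_after=None,
              delete_after=None, snapshot_delete_after=None, enabled=True):
    """Build one lifecycle rule as a plain dictionary (hypothetical helper)."""
    base_blob = {}
    if cool_after is not None:
        base_blob["tierToCool"] = {"daysAfterModificationGreaterThan": cool_after}
    if archive_after is not None:
        base_blob["tierToArchive"] = {"daysAfterModificationGreaterThan": archive_after}
    if delete_after is not None:
        base_blob["delete"] = {"daysAfterModificationGreaterThan": delete_after}

    actions = {}
    if base_blob:
        actions["baseBlob"] = base_blob
    if snapshot_delete_after is not None:
        actions["snapshot"] = {
            "delete": {"daysAfterCreationGreaterThan": snapshot_delete_after}
        }
    if not actions:
        raise ValueError("a rule needs at least one action")

    # blockBlob is the only blob type the current release supports.
    filters = {"blobTypes": ["blockBlob"]}
    if prefixes:
        filters["prefixMatch"] = list(prefixes)

    return {
        "name": name,
        "enabled": enabled,
        "type": "Lifecycle",
        "definition": {"filters": filters, "actions": actions},
    }


policy = {"rules": [make_rule("ruleFoo", ["container1/foo"],
                              cool_after=30, archive_after=90,
                              delete_after=2555, snapshot_delete_after=90)]}
print(json.dumps(policy, indent=2))
```

The printed JSON can be pasted into the Code view tab described earlier.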

Rule filters

Filters limit rule actions to a subset of blobs within the storage account. If more than one filter is defined, a logical AND runs on all filters.

Filters include:

| Filter name | Filter type | Notes | Is required |
|---|---|---|---|
| blobTypes | An array of predefined enum values. | The current release supports blockBlob. | Yes |
| prefixMatch | An array of strings for prefixes to be matched. Each rule can define up to 10 prefixes. A prefix string must start with a container name. | For example, if you want to match all blobs under https://myaccount.blob.core.chinacloudapi.cn/container1/foo/... for a rule, the prefixMatch is container1/foo. If you don't define prefixMatch, the rule applies to all blobs within the storage account. | No |
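The filter semantics in the table above can be illustrated with a few lines of code. This is a sketch of the documented behavior, not the service's implementation: `rule_applies` is a hypothetical function, and it assumes that prefix comparison is case-sensitive and that a blob matches when it starts with any one of the listed prefixes.

```python
def rule_applies(blob_path: str, blob_type: str, filters: dict) -> bool:
    """Decide whether a rule's filter set selects a blob.

    blob_path is "<container name>/<blob name>", which mirrors how
    prefixMatch strings start with a container name.
    """
    # Filter 1: the blob type must be listed in blobTypes.
    if blob_type not in filters.get("blobTypes", []):
        return False

    # Filter 2 (ANDed with filter 1): the path must start with a listed prefix.
    prefixes = filters.get("prefixMatch")
    if not prefixes:
        # No prefixMatch means the rule applies to all blobs in the account.
        return True
    return any(blob_path.startswith(prefix) for prefix in prefixes)


filters = {"blobTypes": ["blockBlob"], "prefixMatch": ["container1/foo"]}
print(rule_applies("container1/foo/report.csv", "blockBlob", filters))
print(rule_applies("container1/bar/report.csv", "blockBlob", filters))
```

Note that a prefix such as container1/foo also selects container1/foobar.txt, because the match is on the leading characters of the path and not on a folder boundary.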

Rule actions

Actions are applied to the filtered blobs when the run condition is met.

Lifecycle management supports tiering and deletion of blobs and deletion of blob snapshots. Define at least one action for each rule on blobs or blob snapshots.

| Action | Base Blob | Snapshot |
|---|---|---|
| tierToCool | Supports blobs currently at the hot tier | Not supported |
| tierToArchive | Supports blobs currently at the hot or cool tier | Not supported |
| delete | Supported | Supported |

Note

If you define more than one action on the same blob, lifecycle management applies the least expensive action to the blob. For example, action delete is cheaper than action tierToArchive. Action tierToArchive is cheaper than action tierToCool.

The run conditions are based on age. Base blobs use the last modified time to track age, and blob snapshots use the snapshot creation time to track age.

| Action run condition | Condition value | Description |
|---|---|---|
| daysAfterModificationGreaterThan | Integer value indicating the age in days | The condition for base blob actions |
| daysAfterCreationGreaterThan | Integer value indicating the age in days | The condition for blob snapshot actions |
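To see how the run conditions and the "least expensive action" note interact, here is a small model of action selection for a base blob. It is a sketch built only from the tables and note above: `pick_base_blob_action` is a hypothetical name, the comparison is assumed to be strictly greater-than (as the condition name suggests), and delete is assumed to apply to a blob in any tier.

```python
# Cheapest action first, per the note: delete < tierToArchive < tierToCool.
PRECEDENCE = ("delete", "tierToArchive", "tierToCool")

# Current tiers each action supports for base blobs.
# Assumption: delete is allowed regardless of the blob's current tier.
SUPPORTED_TIERS = {
    "delete": {"hot", "cool", "archive"},
    "tierToArchive": {"hot", "cool"},
    "tierToCool": {"hot"},
}


def pick_base_blob_action(age_days, current_tier, base_blob_actions):
    """Return the action to apply to a base blob, or None if none qualifies.

    age_days is the number of days since the blob was last modified.
    """
    for action in PRECEDENCE:
        condition = base_blob_actions.get(action)
        if condition is None:
            continue  # the rule does not define this action
        if current_tier not in SUPPORTED_TIERS[action]:
            continue  # for example, tierToCool on a blob that is already cool
        if age_days > condition["daysAfterModificationGreaterThan"]:
            return action
    return None


sample_actions = {
    "tierToCool": {"daysAfterModificationGreaterThan": 30},
    "tierToArchive": {"daysAfterModificationGreaterThan": 90},
    "delete": {"daysAfterModificationGreaterThan": 2555},
}
for age, tier in ((10, "hot"), (45, "hot"), (120, "cool"), (3000, "archive")):
    print(age, tier, pick_base_blob_action(age, tier, sample_actions))
```

With the sample rule's thresholds, a blob that qualifies for both tierToCool and tierToArchive is archived directly, because archiving is the cheaper of the two actions.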

Examples

The following examples demonstrate how to address common scenarios with lifecycle policy rules.

Move aging data to a cooler tier

This example shows how to transition block blobs prefixed with container1/foo or container2/bar. The policy transitions blobs that haven't been modified in over 30 days to cool storage, and blobs that haven't been modified in over 90 days to the archive tier:

{
  "rules": [
    {
      "name": "agingRule",
      "enabled": true,
      "type": "Lifecycle",
      "definition": {
        "filters": {
          "blobTypes": [ "blockBlob" ],
          "prefixMatch": [ "container1/foo", "container2/bar" ]
        },
        "actions": {
          "baseBlob": {
            "tierToCool": { "daysAfterModificationGreaterThan": 30 },
            "tierToArchive": { "daysAfterModificationGreaterThan": 90 }
          }
        }
      }
    }
  ]
}

Archive data at ingest

Some data stays idle in the cloud and is rarely, if ever, accessed once stored. The following lifecycle policy is configured to archive data as soon as it's ingested. This example transitions block blobs in the container archivecontainer to the archive tier. The transition is accomplished by acting on blobs 0 days after the last modified time:

{
  "rules": [
    {
      "name": "archiveRule",
      "enabled": true,
      "type": "Lifecycle",
      "definition": {
        "filters": {
          "blobTypes": [ "blockBlob" ],
          "prefixMatch": [ "archivecontainer" ]
        },
        "actions": {
          "baseBlob": {
            "tierToArchive": { "daysAfterModificationGreaterThan": 0 }
          }
        }
      }
    }
  ]
}

Expire data based on age

Some data is expected to expire days or months after creation. You can configure a lifecycle management policy to expire data by deleting it based on its age. The following example shows a policy that deletes all block blobs that haven't been modified in more than 365 days.

{
  "rules": [
    {
      "name": "expirationRule",
      "enabled": true,
      "type": "Lifecycle",
      "definition": {
        "filters": {
          "blobTypes": [ "blockBlob" ]
        },
        "actions": {
          "baseBlob": {
            "delete": { "daysAfterModificationGreaterThan": 365 }
          }
        }
      }
    }
  ]
}

Delete old snapshots

For data that is modified and accessed regularly throughout its lifetime, snapshots are often used to track older versions of the data. You can create a policy that deletes old snapshots based on snapshot age. The snapshot age is determined by evaluating the snapshot creation time. This policy rule deletes block blob snapshots within the container activedata that are 90 days or older after snapshot creation.

{
  "rules": [
    {
      "name": "snapshotRule",
      "enabled": true,
      "type": "Lifecycle",
      "definition": {
        "filters": {
          "blobTypes": [ "blockBlob" ],
          "prefixMatch": [ "activedata" ]
        },
        "actions": {
          "snapshot": {
            "delete": { "daysAfterCreationGreaterThan": 90 }
          }
        }
      }
    }
  ]
}

FAQ

I created a new policy. Why don't the actions run immediately?
The platform runs the lifecycle policy once a day. Once you configure a policy, it can take up to 24 hours for some actions to run for the first time.

I manually rehydrated an archived blob. How do I temporarily prevent it from being moved back to the archive tier?
When a blob is moved from one access tier to another, its last modification time doesn't change. If you manually rehydrate an archived blob to the hot tier, the lifecycle management engine moves it back to the archive tier. To prevent the blob from being archived again, temporarily disable the rule that affects it. If the blob needs to stay in the hot tier permanently, copy it to another location. Re-enable the rule when the blob can be safely moved back to the archive tier.

Next steps

Learn how to recover data after accidental deletion:
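Disabling a rule comes down to setting its enabled parameter to false in the policy document and saving the policy again. As a rough sketch, the hypothetical helper below toggles a rule in a policy that has been loaded into a Python dictionary. It does not call any Azure service; saving the result is still done through the portal, PowerShell, or a template as described earlier.

```python
import copy


def set_rule_enabled(policy: dict, rule_name: str, enabled: bool) -> dict:
    """Return a copy of the policy with one rule enabled or disabled.

    Rule names are case-sensitive, so the match is exact.
    """
    updated = copy.deepcopy(policy)  # leave the caller's policy untouched
    for rule in updated["rules"]:
        if rule["name"] == rule_name:
            rule["enabled"] = enabled
            return updated
    raise KeyError(f"no rule named {rule_name!r}")


policy = {
    "rules": [
        {"name": "archiveRule", "enabled": True, "type": "Lifecycle", "definition": {}}
    ]
}
paused = set_rule_enabled(policy, "archiveRule", False)
print(paused["rules"][0]["enabled"])
```

Calling the same helper with True restores the rule once the blob can safely return to the archive tier.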