This article explains how to add caching to a skillset pipeline so that you can modify downstream enrichment steps without a full rebuild every time. By default, a skillset is stateless, and changing any part of its composition requires a full rerun of the indexer. With an enrichment cache, the indexer determines which parts of the document tree must be refreshed based on skillset or indexer definition changes. Existing processed output is preserved and reused where possible.
Cached content is placed in Azure Storage using a connection string that you provide. The following objects are created when you run the indexer. Treat them as internal components managed by your search service; they must not be modified.
- A container named ms-az-search-indexercache-<alpha-numeric-string>
- Tables named MsAzSearchIndexerCacheIndex<alpha-numeric-string>
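Because these objects must not be modified, it can help to recognize them before running cleanup scripts against the storage account. The following is a minimal sketch that matches names against the two prefixes above. The helper names are hypothetical, and the assumption that the suffix contains only ASCII letters and digits is inferred from the <alpha-numeric-string> placeholder rather than from a published specification.

```python
import re

# Prefixes come from the article; the suffix character set is an assumption.
CONTAINER_PREFIX = "ms-az-search-indexercache-"
TABLE_PREFIX = "MsAzSearchIndexerCacheIndex"
_SUFFIX = r"[A-Za-z0-9]+"

_CONTAINER_RE = re.compile(re.escape(CONTAINER_PREFIX) + _SUFFIX)
_TABLE_RE = re.compile(re.escape(TABLE_PREFIX) + _SUFFIX)


def is_cache_container(name: str) -> bool:
    """Return True if a blob container name looks like an enrichment cache."""
    return _CONTAINER_RE.fullmatch(name) is not None


def is_cache_table(name: str) -> bool:
    """Return True if a table name looks like an enrichment cache table."""
    return _TABLE_RE.fullmatch(name) is not None


def split_cache_objects(names):
    """Partition names into (cache_objects, other_objects), preserving order."""
    cache, other = [], []
    for name in names:
        if is_cache_container(name) or is_cache_table(name):
            cache.append(name)
        else:
            other.append(name)
    return cache, other
```

A cleanup script could call split_cache_objects on the listed container and table names and skip everything in the first list.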
Prerequisites
Azure Storage for storing cached enrichments. The storage account must be general purpose v2.
For blob indexing only, if you need synchronized document removal from both the cache and index when blobs are deleted from your data source, enable a deletion policy in the indexer. Without this policy, document deletion from the cache isn't supported.
You should be familiar with setting up indexers and skillsets. Start with indexer overview and then continue on to skillsets to learn about enrichment pipelines.
Limitations
Caution
If you're using the SharePoint Online indexer, avoid incremental enrichment. Under certain circumstances, the cache becomes invalid, and reloading it requires an indexer reset and a full rebuild.
Permissions
The Azure AI Search identity needs write access to Azure Storage through the following roles:
- Storage Blob Data Contributor
- Storage Table Data Contributor
The connection string syntax determines whether a system-assigned or user-assigned identity is used. For more information, see Connect to Azure Storage using a managed identity.
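As an illustration of the connection string syntax, the sketch below builds a key-free, resource ID style string for use with a system-assigned identity. The ResourceId=...; format is an assumption based on the managed identity article referenced above, so confirm it there before relying on it. The function name and its parameters are hypothetical.

```python
def storage_resource_id_connection_string(subscription_id: str,
                                          resource_group: str,
                                          account_name: str) -> str:
    """Build a ResourceId-style connection string (assumed format, no account key)."""
    parts = {
        "subscription_id": subscription_id,
        "resource_group": resource_group,
        "account_name": account_name,
    }
    for label, value in parts.items():
        # Reject empty values and characters that would break the path or the string.
        if not value or "/" in value or ";" in value:
            raise ValueError(f"invalid {label}: {value!r}")
    return (
        f"ResourceId=/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Storage/storageAccounts/{account_name};"
    )
```

The returned value would be used as storageConnectionString in place of a string that contains an account key.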
Set the cache property
Use this procedure for both new and existing indexers.
In the indexer definition, set cache with:
- (Required) storageConnectionString set to an Azure Storage connection string.
- (Optional) enableReprocessing (true by default). Set it to false to suspend incremental enrichment temporarily, and switch it back to true later.
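The two properties can be assembled programmatically before the indexer definition is sent to the service. This is a minimal sketch: build_cache_section and with_cache are hypothetical helper names, and only the property names storageConnectionString and enableReprocessing come from the article.

```python
def build_cache_section(storage_connection_string: str,
                        enable_reprocessing: bool = True) -> dict:
    """Return the 'cache' object for an indexer definition."""
    if not storage_connection_string:
        # storageConnectionString is the only required property.
        raise ValueError("storageConnectionString is required")
    return {
        "storageConnectionString": storage_connection_string,
        "enableReprocessing": bool(enable_reprocessing),
    }


def with_cache(indexer_definition: dict, cache: dict) -> dict:
    """Return a copy of the indexer definition with the cache section set."""
    updated = dict(indexer_definition)  # shallow copy; the input is left unchanged
    updated["cache"] = cache
    return updated
```

To suspend incremental enrichment temporarily, rebuild the section with enable_reprocessing=False and update the indexer; switch it back to True later in the same way.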
On the left, select Indexers.
Select Add indexer to create a new indexer, or open an existing one in JSON edit mode.
Enable incremental enrichment, set the enrichment cache storage account, and save the indexer.
Reset the indexer if it already exists.
Run the indexer. This one-time full rebuild seeds the cache. After it's loaded, incremental reuse applies on subsequent runs.
If you're editing an existing indexer, we recommend that you first issue a GET Indexer request to retrieve its current definition.
Use the latest preview API for Create or Update Indexer.
PUT https://[YOUR-SEARCH-SERVICE].search.azure.cn/indexers/[YOUR-INDEXER-NAME]?api-version=2025-11-01-preview
Content-Type: application/json
api-key: [YOUR-ADMIN-KEY]
{
    "name": "<YOUR-INDEXER-NAME>",
    "targetIndexName": "<YOUR-INDEX-NAME>",
    "dataSourceName": "<YOUR-DATASOURCE-NAME>",
    "skillsetName": "<YOUR-SKILLSET-NAME>",
    "cache": {
        "storageConnectionString": "<YOUR-STORAGE-ACCOUNT-CONNECTION-STRING>",
        "enableReprocessing": true
    },
    "fieldMappings": [],
    "outputFieldMappings": [],
    "parameters": []
}
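The same request can be prepared from a script. The sketch below uses only the Python standard library and builds the PUT request without sending it. The function name is hypothetical; the URL shape, headers, and API version are copied from the request above.

```python
import json
import urllib.request

API_VERSION = "2025-11-01-preview"  # preview version used in the request above


def build_put_indexer_request(endpoint: str, indexer: dict, admin_key: str,
                              api_version: str = API_VERSION) -> urllib.request.Request:
    """Prepare (but do not send) a Create or Update Indexer request."""
    url = (f"{endpoint.rstrip('/')}/indexers/{indexer['name']}"
           f"?api-version={api_version}")
    body = json.dumps(indexer).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json", "api-key": admin_key},
    )
```

Passing the returned object to urllib.request.urlopen sends the request. Keeping construction separate from sending makes it possible to inspect the URL and body first.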
Reset the indexer if it already exists.
Run the indexer. This one-time full rebuild seeds the cache. After it's loaded, incremental reuse applies on subsequent runs.
POST https://[YOUR-SEARCH-SERVICE].search.azure.cn/indexers/[YOUR-INDEXER-NAME]/run?api-version=2025-11-01-preview
Content-Type: application/json
api-key: [YOUR-ADMIN-KEY]
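The reset and run steps can be scripted in the same way. This sketch prepares both POST requests in order without sending them. The /run path is taken from the request above; the /reset path is an assumption based on the Reset Indexer REST operation, and the function names are hypothetical.

```python
import urllib.request

API_VERSION = "2025-11-01-preview"  # preview version used in the request above


def build_indexer_action_request(endpoint: str, indexer_name: str, action: str,
                                 admin_key: str,
                                 api_version: str = API_VERSION) -> urllib.request.Request:
    """Prepare (but do not send) a POST for the 'reset' or 'run' indexer action."""
    if action not in ("reset", "run"):
        raise ValueError(f"unsupported action: {action!r}")
    url = (f"{endpoint.rstrip('/')}/indexers/{indexer_name}/{action}"
           f"?api-version={api_version}")
    return urllib.request.Request(
        url,
        data=b"",  # both actions take an empty body
        method="POST",
        headers={"Content-Type": "application/json", "api-key": admin_key},
    )


def seed_cache_requests(endpoint: str, indexer_name: str, admin_key: str) -> list:
    """Return the reset request followed by the run request that seeds the cache."""
    return [
        build_indexer_action_request(endpoint, indexer_name, "reset", admin_key),
        build_indexer_action_request(endpoint, indexer_name, "run", admin_key),
    ]
```

For a new indexer, only the run request is needed; the reset applies when the indexer already exists.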
If you now issue another GET request on the indexer, the response from the service includes an ID property in the cache object. The service appends this string to the name of the container that holds the cached results and intermediate state of every document processed by this indexer, which gives each cache a unique name in Blob storage.
"cache": {
    "ID": "<ALPHA-NUMERIC STRING>",
    "enableReprocessing": true,
    "storageConnectionString": "DefaultEndpointsProtocol=https;AccountName=<YOUR-STORAGE-ACCOUNT>;AccountKey=<YOUR-STORAGE-KEY>;EndpointSuffix=core.chinacloudapi.cn"
}
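To locate the matching container in Azure Storage, the ID can be combined with the container prefix given at the start of this article. The sketch below assumes the ID is appended to the prefix unchanged and accepts either "ID" or "id" as the property name, because the exact casing in a live response isn't confirmed here. The function name is hypothetical.

```python
CONTAINER_PREFIX = "ms-az-search-indexercache-"  # prefix given in this article


def cache_container_name(indexer_definition: dict):
    """Return the expected cache container name, or None if no cache ID is present."""
    cache = indexer_definition.get("cache") or {}
    # Accept both casings; which one the service returns is an assumption.
    cache_id = cache.get("ID") or cache.get("id")
    if not cache_id:
        return None  # the cache hasn't been created yet, or caching isn't enabled
    return CONTAINER_PREFIX + cache_id
```

A None result after the first run suggests that the indexer was saved without a cache property.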
Check for cached output
Sign in to the Azure portal and find your Azure Storage account.
Use Storage Browser to review containers and tables.
A cache is created and used by an indexer. Its content isn't human readable.
To verify whether the cache is operational, modify a skillset and run the indexer, then compare before-and-after metrics for execution time and document count.
Skillsets that include image analysis and Optical Character Recognition (OCR) of scanned documents make good test cases. If you modify a downstream text skill or any skill that isn't image-related, the indexer can retrieve previously processed image and OCR content from cache, and process only text-related changes from your edits. You can expect fewer documents in indexer execution counts, shorter execution times, and lower costs.
The file set used in the cog-search-demo tutorials is a useful test case because it contains 14 files in various formats: JPG, PNG, HTML, DOCX, PPTX, and other types. For proof-of-concept testing of incremental enrichment, change en to es or another language in the text translation skill.
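The before-and-after comparison can be automated from indexer status output. The following sketch assumes that each run is described by an object with startTime, endTime, and itemsProcessed fields, as in the lastResult section of a Get Indexer Status response, and that timestamps are ISO 8601 strings with whole or millisecond seconds. The function names are hypothetical.

```python
from datetime import datetime


def _parse_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 timestamp, accepting a trailing 'Z' for UTC."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def summarize_run(last_result: dict) -> dict:
    """Extract the duration in seconds and the document count for one run."""
    start = _parse_timestamp(last_result["startTime"])
    end = _parse_timestamp(last_result["endTime"])
    return {
        "seconds": (end - start).total_seconds(),
        "itemsProcessed": last_result["itemsProcessed"],
    }


def compare_runs(before: dict, after: dict) -> dict:
    """Report how much shorter and smaller the second run was than the first."""
    b, a = summarize_run(before), summarize_run(after)
    return {
        "seconds_saved": b["seconds"] - a["seconds"],
        "fewer_items": b["itemsProcessed"] - a["itemsProcessed"],
    }
```

Capture the status once after the seeding run and again after the run that follows a skillset edit, then pass both results to compare_runs. Positive values are consistent with cached content being reused.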
Common errors
The following error occurs if you forget to specify a preview API version on the request:
"The request is invalid. Details: indexer : A resource without a type name was found, but no expected type was specified. To allow entries without type information, the expected type must also be specified when the model is specified."
A 400 Bad Request error also occurs if an indexer requirement is missing. The error message specifies the missing dependencies.
Next step
Incremental enrichment applies to indexers that have skillsets, and it provides reusable content for both indexes and knowledge stores. The following link provides more information about cache management.