Manage an enrichment cache

Important

This feature is in public preview under supplemental terms of use. The preview REST API supports this feature.

An enrichment cache is an optional feature that stores reusable enriched content created during skillset execution so that only new and changed skills and documents incur standard processing charges during future indexer and skillset processing.

The enrichment cache is created in Azure Storage. The cache contains the output from document cracking, plus the outputs of each skill for every document. Although caching is billable (it uses Azure Storage), the overall cost of enrichment is reduced because storage costs less than repeated image extraction and AI processing.

If you have configured an enrichment cache, this article explains how to manage skill and data source updates so that you get maximum utility from cached enrichments.

Prerequisites

An indexer and skillset
An enrichment cache

Limitations

Caution

If you're using the SharePoint Online indexer, you should avoid incremental enrichment. Under certain circumstances, the cache becomes invalid, requiring an indexer reset and full rebuild, should you choose to reload it.

Cache configuration

Physically, the cache is stored in a blob container and tables in your Azure Storage account, one per indexer. Each indexer is assigned a unique and immutable cache identifier that corresponds to the container it's using.

The cache is created when you specify the cache property and run the indexer. Only enriched content can be cached. If your indexer doesn't have an attached skillset, then caching doesn't apply.

The following example illustrates an indexer with caching enabled. See Configure enrichment caching for full instructions.

POST https://[YOUR-SEARCH-SERVICE-NAME].search.azure.cn/indexers?api-version=2026-05-01-preview
    {
        "name": "myIndexerName",
        "targetIndexName": "myIndex",
        "dataSourceName": "myDatasource",
        "skillsetName": "mySkillset",
        "cache" : {
            "storageConnectionString" : "<Your storage account connection string>",
            "enableReprocessing": true
        },
        "fieldMappings" : [],
        "outputFieldMappings": [],
        "parameters": {}
    }
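As a rough illustration, the same request can be assembled programmatically. The following Python sketch only builds the URL and JSON body and doesn't send anything. The function name and the "my-service" service name are hypothetical, and the index, data source, and skillset names are the placeholders from the example above.

```python
import json
from urllib.parse import urlencode

API_VERSION = "2026-05-01-preview"  # same version as the REST example above


def build_create_indexer_request(service_name, indexer_name,
                                 storage_connection_string,
                                 enable_reprocessing=True):
    """Return (method, url, json_body) for an indexer with caching enabled.

    Hypothetical helper: it assembles the request but does not send it.
    """
    url = (f"https://{service_name}.search.azure.cn/indexers?"
           + urlencode({"api-version": API_VERSION}))
    body = {
        "name": indexer_name,
        "targetIndexName": "myIndex",
        "dataSourceName": "myDatasource",
        "skillsetName": "mySkillset",
        # The cache property is what enables enrichment caching.
        "cache": {
            "storageConnectionString": storage_connection_string,
            "enableReprocessing": enable_reprocessing,
        },
    }
    return "POST", url, json.dumps(body, indent=2)


method, url, body = build_create_indexer_request(
    "my-service", "myIndexerName", "<Your storage account connection string>")
print(method, url)
print(body)
```

You would pass the resulting URL and body to the HTTP client of your choice, along with your usual authentication headers.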

Cache management

The lifecycle of the cache is managed by the indexer. If an indexer is deleted, its cache is also deleted. If the cache property on the indexer is set to null or the connection string is changed, the existing cache is deleted on the next indexer run.

While incremental enrichment is designed to detect and respond to changes with no intervention on your part, you can set parameters to invoke specific behaviors:

Prioritize new documents
Bypass skillset evaluation
Bypass data source validation checks
Force skillset evaluation

Prioritize new documents

The cache property includes an enableReprocessing parameter that controls whether cached content is reprocessed. When true (default), cached documents are reprocessed when you rerun the indexer if a skill update affects them.

When false, existing documents aren't reprocessed, effectively prioritizing new content. Set enableReprocessing to false only temporarily. Keeping it true most of the time ensures that both new and existing documents remain valid for the current skillset definition.
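One way to make the temporary nature of this setting explicit is to derive the paused and restored definitions from the same indexer definition. The sketch below is a minimal illustration that works on a plain dictionary. The helper name is hypothetical, and it assumes you would send the resulting definition back through Create or Update Indexer yourself.

```python
import copy


def with_reprocessing(indexer_definition, enabled):
    """Return a copy of the definition with cache.enableReprocessing set.

    Hypothetical helper: the original dictionary is left untouched.
    """
    if not indexer_definition.get("cache"):
        raise ValueError("This indexer has no cache configured.")
    updated = copy.deepcopy(indexer_definition)
    updated["cache"]["enableReprocessing"] = enabled
    return updated


indexer = {
    "name": "myIndexerName",
    "cache": {
        "storageConnectionString": "<Your storage account connection string>",
        "enableReprocessing": True,
    },
}

# Temporarily prioritize new documents, then restore the default afterward.
paused = with_reprocessing(indexer, False)
restored = with_reprocessing(paused, True)
print(paused["cache"]["enableReprocessing"], restored["cache"]["enableReprocessing"])
```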

Bypass skillset evaluation

Modifying a skill and reprocessing that skill typically go hand in hand. However, some changes to a skill shouldn't result in reprocessing (for example, deploying a custom skill to a new location or with a new access key). Most likely, these are peripheral modifications that have no genuine impact on the substance of the skill output itself.

If you know that a change to the skill is indeed superficial, you should override skill evaluation by setting the disableCacheReprocessingChangeDetection parameter to true:

Call Update Skillset and modify the skillset definition.
Append the "disableCacheReprocessingChangeDetection=true" parameter on the request.
Submit the change.

Setting this parameter ensures that only updates to the skillset definition are committed and the change isn't evaluated for effects on the existing cache. Use a preview API version, 2020-06-30-Preview or later. We recommend the latest preview API.

PUT https://[servicename].search.azure.cn/skillsets/[skillset name]?api-version=2026-05-01-preview&disableCacheReprocessingChangeDetection=true
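If you build request URLs in code, it can help to make the bypass an explicit, opt-in argument so that it's never applied by accident. The following Python sketch shows one way to do that. The function and argument names are hypothetical, "my-service" is a placeholder service name, and only the URL is constructed.

```python
from urllib.parse import quote, urlencode


def skillset_update_url(service_name, skillset_name,
                        skip_change_detection=False,
                        api_version="2026-05-01-preview"):
    """Build the PUT URL for a skillset update (hypothetical helper).

    The bypass parameter is appended only when explicitly requested.
    """
    query = {"api-version": api_version}
    if skip_change_detection:
        query["disableCacheReprocessingChangeDetection"] = "true"
    name = quote(skillset_name, safe="")
    return (f"https://{service_name}.search.azure.cn/skillsets/{name}"
            f"?{urlencode(query)}")


normal_url = skillset_update_url("my-service", "mySkillset")
bypass_url = skillset_update_url("my-service", "mySkillset",
                                 skip_change_detection=True)
print(normal_url)
print(bypass_url)
```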

Bypass data source validation checks

Most changes to a data source definition invalidate the cache. However, when you know that a change shouldn't invalidate the cache (for example, changing a connection string or rotating the key on the storage account), append the ignoreResetRequirement parameter to the data source update. Setting this parameter to true allows the commit to go through without triggering a reset condition that would result in all objects being rebuilt and populated from scratch.

PUT https://[search service].search.azure.cn/datasources/[data source name]?api-version=2026-05-01-preview&ignoreResetRequirement=true
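Because this parameter suppresses a safety check, you might want a guard in your own tooling that adds it only when the credentials are the sole difference between the old and new definitions. The following Python sketch illustrates that idea. The guard is a hypothetical client-side convention, not something the service requires, and the data source definitions are simplified placeholders.

```python
from urllib.parse import quote, urlencode


def datasource_update_url(service_name, old_def, new_def,
                          api_version="2026-05-01-preview"):
    """Build the PUT URL for a data source update (hypothetical helper).

    ignoreResetRequirement=true is added only for credential-only changes.
    """
    def without_credentials(definition):
        return {k: v for k, v in definition.items() if k != "credentials"}

    query = {"api-version": api_version}
    credentials_only = (
        old_def != new_def
        and without_credentials(old_def) == without_credentials(new_def)
    )
    if credentials_only:
        query["ignoreResetRequirement"] = "true"
    name = quote(new_def["name"], safe="")
    return (f"https://{service_name}.search.azure.cn/datasources/{name}"
            f"?{urlencode(query)}")


old = {"name": "myDatasource", "type": "azureblob",
       "credentials": {"connectionString": "<old connection string>"},
       "container": {"name": "docs"}}
rotated = dict(old, credentials={"connectionString": "<new connection string>"})
moved = dict(old, container={"name": "other-docs"})

rotated_url = datasource_update_url("my-service", old, rotated)
moved_url = datasource_update_url("my-service", old, moved)
print(rotated_url)
print(moved_url)
```

In this sketch, rotating the connection string produces a URL with the bypass parameter, while changing the container doesn't, so the reset condition still applies to that change.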

Force skillset evaluation

The purpose of the cache is to avoid unnecessary processing, but suppose you make a change to a skill that the indexer doesn't detect (for example, changing something in external code, such as a custom skill).

In this case, you can use Reset Skills to force reprocessing of a particular skill, including any downstream skills that have a dependency on that skill's output. This API accepts a POST request with a list of skills that should be invalidated and marked for reprocessing. After Reset Skills, follow with a Run Indexer request to invoke the pipeline processing.
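The two-step sequence can be sketched as a pair of requests. In the following Python example, the resetskills and run paths and the skillNames body property are assumptions based on the Reset Skills and Run Indexer reference pages, so verify them against the API version you use. The helper name, "my-service", and "myCustomSkill" are hypothetical, and nothing is sent.

```python
import json
from urllib.parse import quote, urlencode


def reset_skills_then_run(service_name, skillset_name, indexer_name,
                          skill_names, api_version="2026-05-01-preview"):
    """Return the two requests, in order, as (method, url, body) tuples.

    Hypothetical helper; the paths and the skillNames property are assumed.
    """
    base = f"https://{service_name}.search.azure.cn"
    query = urlencode({"api-version": api_version})
    reset = (
        "POST",
        f"{base}/skillsets/{quote(skillset_name, safe='')}/resetskills?{query}",
        json.dumps({"skillNames": list(skill_names)}),
    )
    # Reset Skills only marks skills for reprocessing; Run Indexer does the work.
    run = (
        "POST",
        f"{base}/indexers/{quote(indexer_name, safe='')}/run?{query}",
        None,
    )
    return [reset, run]


requests_to_send = reset_skills_then_run(
    "my-service", "mySkillset", "myIndexerName", ["myCustomSkill"])
for method, url, body in requests_to_send:
    print(method, url, body)
```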

Re-cache specific documents

Resetting an indexer will result in all documents in the search corpus being reprocessed.

In scenarios where only a few documents need to be reprocessed, use Reset Documents (preview) to force reprocessing of specific documents. When a document is reset, the indexer invalidates the cache for that document, which is then reprocessed by reading it from the data source. For more information, see Run or reset indexers, skills, and documents.

To reset specific documents, the request provides a list of document keys as read from the search index. If the key is mapped to a field in the external data source, the value that you provide should be the one used in the search index.

Depending on how you call the API, the request will either append, overwrite, or queue up the key list:

Calling the API multiple times with different keys appends the new keys to the list of document keys reset.
Calling the API with the "overwrite" query string parameter set to true will overwrite the current list of document keys to be reset with the request's payload.
Calling the API doesn't reprocess anything by itself. It only adds the document keys to the queue of work the indexer performs. When the indexer is next invoked, either as scheduled or on demand, it prioritizes processing the reset document keys before any other changes from the data source.

The following example illustrates a reset document request:

POST https://[search service name].search.azure.cn/indexers/[indexer name]/resetdocs?api-version=2026-05-01-preview
    {
        "documentKeys" : [
            "key1",
            "key2",
            "key3"
        ]
    }
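The append and overwrite behaviors differ only in the query string. The following Python sketch builds both variants of the request without sending them. The helper name and "my-service" are hypothetical, and the keys are the placeholder values from the example above.

```python
import json
from urllib.parse import quote, urlencode


def reset_docs_request(service_name, indexer_name, document_keys,
                       overwrite=False, api_version="2026-05-01-preview"):
    """Return (method, url, json_body) for a reset documents call.

    Hypothetical helper: overwrite=True replaces the pending key list
    instead of appending to it.
    """
    query = {"api-version": api_version}
    if overwrite:
        query["overwrite"] = "true"
    name = quote(indexer_name, safe="")
    url = (f"https://{service_name}.search.azure.cn/indexers/{name}/resetdocs"
           f"?{urlencode(query)}")
    return "POST", url, json.dumps({"documentKeys": list(document_keys)})


append_request = reset_docs_request("my-service", "myIndexerName",
                                    ["key1", "key2", "key3"])
overwrite_request = reset_docs_request("my-service", "myIndexerName",
                                       ["key4"], overwrite=True)
print(append_request[1])
print(overwrite_request[1])
```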

Changes that invalidate the cache

Once you enable a cache, the indexer evaluates changes in your pipeline composition to determine which content can be reused and which needs reprocessing. This section enumerates changes that invalidate the cache outright, followed by changes that trigger incremental processing.

An invalidating change is one where the entire cache is no longer valid. An example of an invalidating change is one where your data source is updated. Here's the complete list of changes to any part of the indexer pipeline that would invalidate your cache:

Changing the data source type
Changing the data source container
Changing the data source credentials
Changing the data source change detection policy
Changing the data source delete detection policy
Changing indexer field mappings
Changing indexer parameters:
- Parsing Mode
- Excluded File Name Extensions
- Indexed File Name Extensions
- Index storage metadata only for oversized documents
- Delimited text headers
- Delimited text delimiter
- Document Root
- Image Action (Changes to how images are extracted)

Changes that trigger incremental processing

Incremental processing evaluates your skillset definition and determines which skills to rerun, selectively updating the affected portions of the document tree. Here's the complete list of changes resulting in incremental enrichment:

Changing the skill type (the OData type of the skill is updated)
Updating skill-specific parameters, such as a URL, defaults, or other parameters
Changing skill outputs (the skill returns additional or different outputs)
Changing skill inputs in a way that results in different ancestry (skill chaining has changed)
Invalidating any upstream skill (a skill that provides an input to this skill is updated)
Updating the knowledge store projection location, which results in re-projecting documents
Changing the knowledge store projections, which results in re-projecting documents
Changing output field mappings on an indexer, which results in re-projecting documents to the index
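To build intuition for how an upstream change ripples through a skillset, you can model skill dependencies as a graph and walk it downstream. The following Python sketch is a conceptual illustration only. It isn't the service's actual change-detection algorithm, and the skill names and the dependency map format are hypothetical.

```python
from collections import deque


def skills_to_rerun(dependencies, changed):
    """Return the changed skills plus every skill downstream of them.

    dependencies maps each skill name to the set of upstream skill names
    whose outputs it consumes (hypothetical format for this sketch).
    """
    # Invert the map so each skill points to the skills that consume it.
    downstream = {}
    for skill, upstream_skills in dependencies.items():
        for upstream in upstream_skills:
            downstream.setdefault(upstream, set()).add(skill)

    to_rerun = set(changed)
    queue = deque(changed)
    while queue:  # breadth-first walk over dependent skills
        current = queue.popleft()
        for dependent in downstream.get(current, ()):
            if dependent not in to_rerun:
                to_rerun.add(dependent)
                queue.append(dependent)
    return to_rerun


dependencies = {
    "ocr": set(),
    "merge": {"ocr"},
    "entities": {"merge"},
    "language": set(),
}
affected = skills_to_rerun(dependencies, ["merge"])
print(sorted(affected))
```

In this model, changing the merge skill marks it and the entities skill that consumes its output, while the ocr and language skills keep their cached results.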

APIs used for caching

Preview APIs provide extra properties on indexers. We recommend the latest preview API.

Skillsets and data sources can use the generally available version. In addition to the reference documentation, see Configure caching for incremental enrichment for details about order of operations.

Create or Update Indexer (api-version=2026-05-01-preview)
Reset Skills (api-version=2026-05-01-preview)
Create or Update Skillset (api-version=2026-05-01-preview) (New URI parameter on the request)
Create or Update Data Source (api-version=2026-05-01-preview), when called with a preview API version, provides a new parameter named "ignoreResetRequirement", which should be set to true when your update action shouldn't invalidate the cache. Use "ignoreResetRequirement" sparingly as it could lead to unintended inconsistency in your data that won't be detected easily.

Last updated on 2026-06-22