Note
Access to this page requires authorization. You can try signing in or changing directories.
Access to this page requires authorization. You can try changing directories.
Note
This feature is currently in public preview. This preview is provided without a service-level agreement and isn't recommended for production workloads. Certain features might not be supported or might have constrained capabilities. For more information, see Supplemental Terms of Use for Azure Previews.
Use a blob knowledge source to index and query Azure blob content in an agentic retrieval pipeline. Knowledge sources are created independently, referenced in a knowledge base, and used as grounding data when an agent or chatbot calls a retrieve action at query time.
Unlike a search index knowledge source, which specifies an existing and qualified index, a blob knowledge source specifies an external data source, models, and properties to automatically generate the following Azure AI Search objects:
- A data source that represents a blob container.
- A skillset that chunks and optionally vectorizes multimodal content from the container.
- An index that stores enriched content and meets the criteria for agentic retrieval.
- An indexer that uses the previous objects to drive the indexing and enrichment pipeline.
Note
If user access is specified at the document (blob) level in Azure Storage, a knowledge source can carry permission metadata forward to indexed content in Azure AI Search. For more information, see ADLS Gen2 permission metadata or Blob RBAC scopes.
Usage support
| Azure portal | .NET SDK | Python SDK | Java SDK | JavaScript SDK | REST API |
|---|---|---|---|---|---|
| ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ |
Prerequisites
Azure AI Search in any region that provides agentic retrieval. You must have semantic ranker enabled.
An Azure Blob Storage or Azure Data Lake Storage (ADLS) Gen2 account.
A blob container with supported content types for text content. For optional image verbalization, the supported content type depends on whether your chat completion model can analyze and describe the image file.
Permission to create and use objects on Azure AI Search. We recommend role-based access, but you can use API keys if a role assignment isn't feasible. For more information, see Connect to a search service.
- The latest
Azure.Search.Documentspreview package:dotnet add package Azure.Search.Documents --prerelease
- The latest
azure-search-documentspreview package:pip install --pre azure-search-documents
- The 2025-11-01-preview version of the Search Service REST APIs.
Check for existing knowledge sources
A knowledge source is a top-level, reusable object. Knowing about existing knowledge sources is helpful for either reuse or naming new objects.
Run the following code to list knowledge sources by name and type.
// List knowledge sources by name and type
using Azure.Search.Documents.Indexes;
var indexClient = new SearchIndexClient(new Uri(searchEndpoint), credential);
var knowledgeSources = indexClient.GetKnowledgeSourcesAsync();
Console.WriteLine("Knowledge Sources:");
await foreach (var ks in knowledgeSources)
{
Console.WriteLine($" Name: {ks.Name}, Type: {ks.GetType().Name}");
}
You can also return a single knowledge source by name to review its JSON definition.
using Azure.Search.Documents.Indexes;
using System.Text.Json;
var indexClient = new SearchIndexClient(new Uri(searchEndpoint), credential);
// Specify the knowledge source name to retrieve
string ksNameToGet = "earth-knowledge-source";
// Get its definition
var knowledgeSourceResponse = await indexClient.GetKnowledgeSourceAsync(ksNameToGet);
var ks = knowledgeSourceResponse.Value;
// Serialize to JSON for display
var jsonOptions = new JsonSerializerOptions
{
WriteIndented = true,
DefaultIgnoreCondition = System.Text.Json.Serialization.JsonIgnoreCondition.Never
};
Console.WriteLine(JsonSerializer.Serialize(ks, ks.GetType(), jsonOptions));
A knowledge source is a top-level, reusable object. Knowing about existing knowledge sources is helpful for either reuse or naming new objects.
Run the following code to list knowledge sources by name and type.
# List knowledge sources by name and type
import requests
import json
endpoint = "{search_url}/knowledgesources"
params = {"api-version": "2025-11-01-preview", "$select": "name, kind"}
headers = {"api-key": "{api_key}"}
response = requests.get(endpoint, params = params, headers = headers)
print(json.dumps(response.json(), indent = 2))
You can also return a single knowledge source by name to review its JSON definition.
# Get a knowledge source definition
import requests
import json
endpoint = "{search_url}/knowledgesources/{knowledge_source_name}"
params = {"api-version": "2025-11-01-preview"}
headers = {"api-key": "{api_key}"}
response = requests.get(endpoint, params = params, headers = headers)
print(json.dumps(response.json(), indent = 2))
A knowledge source is a top-level, reusable object. Knowing about existing knowledge sources is helpful for either reuse or naming new objects.
Use Knowledge Sources - Get (REST API) to list knowledge sources by name and type.
### List knowledge sources by name and type
GET {{search-url}}/knowledgesources?api-version=2025-11-01-preview&$select=name,kind
api-key: {{api-key}}
You can also return a single knowledge source by name to review its JSON definition.
### Get a knowledge source definition
GET {{search-url}}/knowledgesources/{{knowledge-source-name}}?api-version=2025-11-01-preview
api-key: {{api-key}}
Note
Sensitive information is redacted. The generated resources appear at the end of the response.
Check ingestion status
Run the following code to monitor ingestion progress and health, including indexer status for knowledge sources that generate an indexer pipeline and populate a search index.
// Get knowledge source ingestion status
using Azure.Search.Documents.Indexes;
using System.Text.Json;
var indexClient = new SearchIndexClient(new Uri(searchEndpoint), new AzureKeyCredential(apiKey));
// Get the knowledge source status
var statusResponse = await indexClient.GetKnowledgeSourceStatusAsync(knowledgeSourceName);
var status = statusResponse.Value;
// Serialize to JSON for display
var json = JsonSerializer.Serialize(status, new JsonSerializerOptions { WriteIndented = true });
Console.WriteLine(json);
A response for a request that includes ingestion parameters and is actively ingesting content might look like the following example.
{
"synchronizationStatus": "active", // creating, active, deleting
"synchronizationInterval" : "1d", // null if no schedule
"currentSynchronizationState" : { // spans multiple indexer "runs"
"startTime": "2025-10-27T19:30:00Z",
"itemUpdatesProcessed": 1100,
"itemsUpdatesFailed": 100,
"itemsSkipped": 1100,
},
"lastSynchronizationState" : { // null on first sync
"startTime": "2025-10-27T19:30:00Z",
"endTime": "2025-10-27T19:40:01Z", // this value appears on the activity record on each /retrieve
"itemUpdatesProcessed": 1100,
"itemsUpdatesFailed": 100,
"itemsSkipped": 1100,
},
"statistics": { // null on first sync
"totalSynchronization": 25,
"averageSynchronizationDuration": "00:15:20",
"averageItemsProcessedPerSynchronization" : 500
}
}
Run the following code to monitor ingestion progress and health, including indexer status for knowledge sources that generate an indexer pipeline and populate a search index.
# Check knowledge source ingestion status
import requests
import json
endpoint = "{search_url}/knowledgesources/{knowledge_source_name}/status"
params = {"api-version": "2025-11-01-preview"}
headers = {"api-key": "{api_key}"}
response = requests.get(endpoint, params = params, headers = headers)
print(json.dumps(response.json(), indent = 2))
A response for a request that includes ingestion parameters and is actively ingesting content might look like the following example.
{
"synchronizationStatus": "active", // creating, active, deleting
"synchronizationInterval" : "1d", // null if no schedule
"currentSynchronizationState" : { // spans multiple indexer "runs"
"startTime": "2025-10-27T19:30:00Z",
"itemUpdatesProcessed": 1100,
"itemsUpdatesFailed": 100,
"itemsSkipped": 1100,
},
"lastSynchronizationState" : { // null on first sync
"startTime": "2025-10-27T19:30:00Z",
"endTime": "2025-10-27T19:40:01Z", // this value appears on the activity record on each /retrieve
"itemUpdatesProcessed": 1100,
"itemsUpdatesFailed": 100,
"itemsSkipped": 1100,
},
"statistics": { // null on first sync
"totalSynchronization": 25,
"averageSynchronizationDuration": "00:15:20",
"averageItemsProcessedPerSynchronization" : 500
}
}
Use Knowledge Sources - Status (REST API) to monitor ingestion progress and health, including indexer status for knowledge sources that generate an indexer pipeline and populate a search index.
### Check knowledge source ingestion status
GET {{search-url}}/knowledgesources/{{knowledge-source-name}}/status?api-version=2025-11-01-preview
api-key: {{api-key}}
Content-Type: application/json
A response for a request that includes ingestion parameters and is actively ingesting content might look like the following example.
{
"synchronizationStatus": "active", // creating, active, deleting
"synchronizationInterval" : "1d", // null if no schedule
"currentSynchronizationState" : { // spans multiple indexer "runs"
"startTime": "2025-10-27T19:30:00Z",
"itemUpdatesProcessed": 1100,
"itemsUpdatesFailed": 100,
"itemsSkipped": 1100,
},
"lastSynchronizationState" : { // null on first sync
"startTime": "2025-10-27T19:30:00Z",
"endTime": "2025-10-27T19:40:01Z", // this value appears on the activity record on each /retrieve
"itemUpdatesProcessed": 1100,
"itemsUpdatesFailed": 100,
"itemsSkipped": 1100,
},
"statistics": { // null on first sync
"totalSynchronization": 25,
"averageSynchronizationDuration": "00:15:20",
"averageItemsProcessedPerSynchronization" : 500
}
}
Review the created objects
When you create a blob knowledge source, your search service also creates an indexer, index, skillset, and data source. We don't recommend that you edit these objects, as introducing an error or incompatibility can break the pipeline.
After you create a knowledge source, the response lists the created objects. These objects are created according to a fixed template, and their names are based on the name of the knowledge source. You can't change the object names.
We recommend using the Azure portal to validate output creation. The workflow is:
- Check the indexer for success or failure messages. Connection or quota errors appear here.
- Check the index for searchable content. Use Search Explorer to run queries.
- Check the skillset to learn how your content is chunked and optionally vectorized.
- Check the data source for connection details. Our example uses API keys for simplicity, but you can use Microsoft Entra ID for authentication and role-based access control for authorization.
Assign to a knowledge base
If you're satisfied with the knowledge source, continue to the next step: specify the knowledge source in a knowledge base.
After the knowledge base is configured, use the retrieve action to query the knowledge source.
Tip
To enforce document-level permissions, set IngestionPermissionOptions when you create this knowledge source, and then include the user's access token in the retrieve request. For more information, see Enforce permissions at query time.
Tip
To enforce document-level permissions, set ingestion_permission_options when you create this knowledge source, and then include the user's access token in the retrieve request. For more information, see Enforce permissions at query time.
Tip
To enforce document-level permissions, set ingestionPermissionOptions when you create this knowledge source, and then include the user's access token in the retrieve request. For more information, see Enforce permissions at query time.
Delete a knowledge source
Before you can delete a knowledge source, you must delete any knowledge base that references it or update the knowledge base definition to remove the reference. For knowledge sources that generate an index and indexer pipeline, all generated objects are also deleted. However, if you used an existing index to create a knowledge source, your index isn't deleted.
If you try to delete a knowledge source that's in use, the action fails and returns a list of affected knowledge bases.
To delete a knowledge source:
Get a list of all knowledge bases on your search service.
using Azure.Search.Documents.Indexes; var indexClient = new SearchIndexClient(new Uri(searchEndpoint), credential); var knowledgeBases = indexClient.GetKnowledgeBasesAsync(); Console.WriteLine("Knowledge Bases:"); await foreach (var kb in knowledgeBases) { Console.WriteLine($" - {kb.Name}"); }An example response might look like the following:
Knowledge Bases: - earth-knowledge-base - hotels-sample-knowledge-base - my-demo-knowledge-baseGet an individual knowledge base definition to check for knowledge source references.
using Azure.Search.Documents.Indexes; using System.Text.Json; var indexClient = new SearchIndexClient(new Uri(searchEndpoint), credential); // Specify the knowledge base name to retrieve string kbNameToGet = "earth-knowledge-base"; // Get a specific knowledge base definition var knowledgeBaseResponse = await indexClient.GetKnowledgeBaseAsync(kbNameToGet); var kb = knowledgeBaseResponse.Value; // Serialize to JSON for display string json = JsonSerializer.Serialize(kb, new JsonSerializerOptions { WriteIndented = true }); Console.WriteLine(json);An example response might look like the following:
{ "Name": "earth-knowledge-base", "KnowledgeSources": [ { "Name": "earth-knowledge-source" } ], "Models": [ {} ], "RetrievalReasoningEffort": {}, "OutputMode": {}, "ETag": "\u00220x8DE278629D782B3\u0022", "EncryptionKey": null, "Description": null, "RetrievalInstructions": null, "AnswerInstructions": null }Either delete the knowledge base or update the knowledge base to remove the knowledge source if you have multiple sources. This example shows deletion.
using Azure.Search.Documents.Indexes; var indexClient = new SearchIndexClient(new Uri(searchEndpoint), credential); await indexClient.DeleteKnowledgeBaseAsync(knowledgeBaseName); System.Console.WriteLine($"Knowledge base '{knowledgeBaseName}' deleted successfully.");Delete the knowledge source.
await indexClient.DeleteKnowledgeSourceAsync(knowledgeSourceName); System.Console.WriteLine($"Knowledge source '{knowledgeSourceName}' deleted successfully.");
Before you can delete a knowledge source, you must delete any knowledge base that references it or update the knowledge base definition to remove the reference. For knowledge sources that generate an index and indexer pipeline, all generated objects are also deleted. However, if you used an existing index to create a knowledge source, your index isn't deleted.
If you try to delete a knowledge source that's in use, the action fails and returns a list of affected knowledge bases.
To delete a knowledge source:
Get a list of all knowledge bases on your search service.
# Get knowledge bases import requests import json endpoint = "{search_url}/knowledgebases" params = {"api-version": "2025-11-01-preview", "$select": "name"} headers = {"api-key": "{api_key}"} response = requests.get(endpoint, params = params, headers = headers) print(json.dumps(response.json(), indent = 2))An example response might look like the following:
{ "@odata.context": "https://my-search-service.search.azure.cn/$metadata#knowledgebases(name)", "value": [ { "name": "my-kb" }, { "name": "my-kb-2" } ] }Get an individual knowledge base definition to check for knowledge source references.
# Get a knowledge base definition import requests import json endpoint = "{search_url}/knowledgebases/{knowledge_base_name}" params = {"api-version": "2025-11-01-preview"} headers = {"api-key": "{api_key}"} response = requests.get(endpoint, params = params, headers = headers) print(json.dumps(response.json(), indent = 2))An example response might look like the following:
{ "name": "my-kb", "description": null, "retrievalInstructions": null, "answerInstructions": null, "outputMode": null, "knowledgeSources": [ { "name": "my-blob-ks", } ], "models": [], "encryptionKey": null, "retrievalReasoningEffort": { "kind": "low" } }Either delete the knowledge base or update the knowledge base to remove the knowledge source if you have multiple sources. This example shows deletion.
# Delete a knowledge base from azure.core.credentials import AzureKeyCredential from azure.search.documents.indexes import SearchIndexClient index_client = SearchIndexClient(endpoint = "search_url", credential = AzureKeyCredential("api_key")) index_client.delete_knowledge_base("knowledge_base_name") print(f"Knowledge base deleted successfully.")Delete the knowledge source.
# Delete a knowledge source from azure.core.credentials import AzureKeyCredential from azure.search.documents.indexes import SearchIndexClient index_client = SearchIndexClient(endpoint = "search_url", credential = AzureKeyCredential("api_key")) index_client.delete_knowledge_source("knowledge_source_name") print(f"Knowledge source deleted successfully.")
Before you can delete a knowledge source, you must delete any knowledge base that references it or update the knowledge base definition to remove the reference. For knowledge sources that generate an index and indexer pipeline, all generated objects are also deleted. However, if you used an existing index to create a knowledge source, your index isn't deleted.
If you try to delete a knowledge source that's in use, the action fails and returns a list of affected knowledge bases.
To delete a knowledge source:
Get a list of all knowledge bases on your search service.
### Get knowledge bases GET {{search-endpoint}}/knowledgebases?api-version=2025-11-01-preview&$select=name api-key: {{api-key}}An example response might look like the following:
{ "@odata.context": "https://my-search-service.search.azure.cn/$metadata#knowledgebases(name)", "value": [ { "name": "my-kb" }, { "name": "my-kb-2" } ] }Get an individual knowledge base definition to check for knowledge source references.
### Get a knowledge base definition GET {{search-endpoint}}/knowledgebases/{{knowledge-base-name}}?api-version=2025-11-01-preview api-key: {{api-key}}An example response might look like the following:
{ "name": "my-kb", "description": null, "retrievalInstructions": null, "answerInstructions": null, "outputMode": null, "knowledgeSources": [ { "name": "my-blob-ks", } ], "models": [], "encryptionKey": null, "retrievalReasoningEffort": { "kind": "low" } }Either delete the knowledge base or update the knowledge base by removing the knowledge source if you have multiple sources. This example shows deletion.
### Delete a knowledge base DELETE {{search-endpoint}}/knowledgebases/{{knowledge-base-name}}?api-version=2025-11-01-preview api-key: {{api-key}}Delete the knowledge source.
### Delete a knowledge source DELETE {{search-endpoint}}/knowledgesources/{{knowledge-source-name}}?api-version=2025-11-01-preview api-key: {{api-key}}