Note
This feature is currently in public preview. This preview is provided without a service-level agreement and isn't recommended for production workloads. Certain features might not be supported or might have constrained capabilities. For more information, see Supplemental Terms of Use for Azure Previews.
In agentic retrieval, you can specify the level of large language model (LLM) processing for query planning and answer formulation. Use the retrievalReasoningEffort property to set LLM processing levels that affect costs and latency. Extra LLM processing improves relevancy, but it also takes longer and uses billable LLM resources. You can set this property in a knowledge base or on a retrieve request.
Levels of reasoning effort include:
| Level | Effort |
|---|---|
| minimal | No LLM processing. You provide the query. |
| low | Runs a single pass of LLM-based query planning and knowledge source selection. This is the default. The LLM analyzes the query and breaks it into component parts as needed. |
| medium | Adds deeper search and an enhanced retrieval stack to agentic retrieval to maximize completeness. |
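As a rough illustration of these levels, the following Python sketch builds the retrievalReasoningEffort fragment and rejects any value outside the three levels listed. The helper name reasoning_effort is hypothetical and isn't part of any Azure SDK. Only the property name and the kind values come from this article.

```python
import json

# Levels listed in this article: minimal, low (the default), and medium.
VALID_LEVELS = ("minimal", "low", "medium")


def reasoning_effort(kind: str = "low") -> dict:
    """Return a retrievalReasoningEffort fragment for the given level.

    Hypothetical helper for illustration only; "low" mirrors the documented default.
    """
    if kind not in VALID_LEVELS:
        raise ValueError(f"kind must be one of {VALID_LEVELS}, got {kind!r}")
    return {"retrievalReasoningEffort": {"kind": kind}}


if __name__ == "__main__":
    # Print the fragment you would merge into a knowledge base or retrieve request.
    print(json.dumps(reasoning_effort("medium"), indent=2))
```

Validating the level on the client is a design choice for this sketch. The service might accept or reject values differently.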
Prerequisites
Azure AI Search in any region that provides agentic retrieval.
Familiarity with agentic retrieval concepts and workflow.
A knowledge base that uses the 2025-11-01-preview syntax.
Visual Studio Code with the REST Client extension or a preview Azure SDK package that provides the knowledge base REST APIs.
Set the reasoning effort in a knowledge base
To establish the default behavior, set the property in the knowledge base.
1. Use Create or Update Knowledge Base to set the retrievalReasoningEffort.

2. Add the retrievalReasoningEffort property. The following JSON shows the syntax. For more information about knowledge bases, see Create a knowledge base.

   ```json
   "retrievalReasoningEffort": {
       /* no other parameters when effort is minimal */
       "kind": "low"
   }
   ```
Set the reasoning effort in a retrieve request
To override the default on a query-by-query basis, set the property in the retrieve request.
1. Modify a retrieve action to override the knowledge base retrievalReasoningEffort default.

2. Add the retrievalReasoningEffort property. A retrieve request might look similar to the following example.

   ```json
   {
       "messages": [ /* trimmed for brevity */ ],
       "retrievalReasoningEffort": { "kind": "low" },
       "outputMode": "answerSynthesis",
       "maxRuntimeInSeconds": 30,
       "maxOutputSize": 6000
   }
   ```
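The per-request override can be sketched in Python as follows. This is a minimal sketch that only assembles the request body; it doesn't call the service. The function name build_retrieve_body and its parameters are hypothetical. It assumes that omitting retrievalReasoningEffort from the request leaves the knowledge base default in effect, as the override wording in this section suggests.

```python
import json
from typing import Optional

# Levels listed in this article.
VALID_LEVELS = ("minimal", "low", "medium")


def build_retrieve_body(messages: list, override: Optional[str] = None) -> dict:
    """Assemble a retrieve request body (hypothetical helper, no network call).

    The property is added only when an override is given, so the knowledge
    base default is assumed to apply otherwise.
    """
    body = {"messages": messages}
    if override is not None:
        if override not in VALID_LEVELS:
            raise ValueError(f"override must be one of {VALID_LEVELS}, got {override!r}")
        body["retrievalReasoningEffort"] = {"kind": override}
    return body


if __name__ == "__main__":
    # Supply your own messages; the list is left empty here on purpose.
    print(json.dumps(build_retrieve_body([], override="medium"), indent=2))
```

Leaving the property out when no override is requested keeps the knowledge base as the single place where the default is defined.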