Vector search in Azure Cosmos DB for NoSQL

Applies to: ✅ NoSQL

Azure Cosmos DB for NoSQL offers efficient vector indexing and search. This feature is designed to handle multi-modal, high-dimensional vectors, enabling efficient and accurate vector search at any scale. You can store vectors directly in your documents alongside your data: each document can contain not only traditional schema-free data, but also multi-modal, high-dimensional vectors as additional properties. This colocation allows for efficient indexing and searching, because the vectors are stored in the same logical unit as the data they represent. Keeping vectors and data together simplifies data management and AI application architectures, and improves the efficiency of vector-based operations.

Azure Cosmos DB for NoSQL offers flexibility by letting you choose the vector indexing method:

A flat or k-nearest neighbors (kNN) exact search (sometimes called brute-force) can provide 100% retrieval recall for smaller, focused vector searches, especially when combined with query filters and partition keys.
A quantized flat index that compresses vectors using DiskANN-based quantization methods for better efficiency in the kNN search.
DiskANN, a suite of state-of-the-art vector indexing algorithms developed by Microsoft Research to power efficient, high accuracy multi-modal vector search at any scale.

To learn more about vector indexing, see Vector indexes.

Vector search in Azure Cosmos DB can be combined with all other supported Azure Cosmos DB NoSQL query filters and indexes by using WHERE clauses. This enables your vector searches to provide the most relevant data for your applications.

This feature enhances the core capabilities of Azure Cosmos DB, making it more versatile for handling vector data and search requirements in AI applications.

Note

Interested in ultra-high throughput vector search capabilities? Azure Cosmos DB is developing enhanced vector search features designed for large vector datasets paired with ultra-high throughput inserts and searches. It can accommodate millions of queries per second (QPS) with predictable, low latency and unmatched cost efficiency. Sign up to learn more about early access opportunities and get notified when these capabilities become available.

Sign up for the expanded Private Preview.

What is a vector store?

A vector store or vector database is a database designed to store and manage vector embeddings, which are mathematical representations of data in a high-dimensional space. In this space, each dimension corresponds to a feature of the data, and tens of thousands of dimensions might be used to represent sophisticated data. A vector's position in this space represents its characteristics. Words, phrases, entire documents, images, audio, and other types of data can all be vectorized.

How does a vector store work?

In a vector store, vector search algorithms are used to index and query embeddings. Some well-known vector search algorithms include Hierarchical Navigable Small World (HNSW), Inverted File (IVF), and DiskANN. Vector search is a method that helps you find similar items based on their data characteristics rather than by exact matches on a property field.

This technique is useful in applications such as searching for similar text, finding related images, making recommendations, or even detecting anomalies.

Vector search measures the distance between the data vectors and your query vector. The data vectors that are closest to your query vector are the ones that are found to be most similar semantically.
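To make the idea of "closest vectors" concrete, here is a minimal sketch of an exact nearest-neighbor lookup in plain Python. It is only an illustration of the concept, not how Azure Cosmos DB implements search; the item names and three-dimensional vectors are made up for the example, and Euclidean distance is used as one possible metric.

```python
import math

def nearest(query, items, k=2):
    """Return the names of the k items whose vectors are closest to the query.

    Every stored vector is compared with the query (an exact, brute-force scan).
    """
    scored = sorted((math.dist(query, vec), name) for name, vec in items.items())
    return [name for _, name in scored[:k]]

# Hypothetical 3-dimensional embeddings, invented for illustration only.
items = {
    "cat": [0.9, 0.1, 0.0],
    "kitten": [0.8, 0.2, 0.1],
    "truck": [0.0, 0.1, 0.9],
}

print(nearest([1.0, 0.0, 0.0], items))  # the two items closest to the query
```

Real embeddings typically have hundreds or thousands of dimensions, which is why scanning every vector becomes expensive and index structures become useful.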

In the integrated vector database in Azure Cosmos DB for NoSQL, embeddings can be stored, indexed, and queried alongside the original data. This approach eliminates the extra cost of replicating data in a separate pure vector database. Moreover, this architecture keeps the vector embeddings and original data together, which better facilitates multi-modal data operations, and enables greater data consistency, scale, and performance.

Enable the vector indexing and search feature

To enable this feature for Azure Cosmos DB for NoSQL, follow these steps:

Go to your Azure Cosmos DB for NoSQL resource page.
On the left pane, under Settings, select Features.
Select Vector Search for NoSQL API.
Read the description of the feature to confirm that you want to enable it.
Select Enable to turn on vector search in Azure Cosmos DB for NoSQL.

Tip

Alternatively, use the Azure CLI to update the capabilities of your account to support NoSQL vector search.

az cosmosdb update \
     --resource-group <resource-group-name> \
     --name <account-name> \
     --capabilities EnableNoSQLVectorSearch

The registration request is autoapproved, but it might take 15 minutes to take effect.

Container vector policies

Performing vector search with Azure Cosmos DB for NoSQL requires you to define a vector policy for the container. This policy gives the database engine the information it needs to conduct efficient similarity search over vectors found in the container's documents. It also supplies the vector indexing policy with necessary information, should you choose to specify one. The following information is included in the container vector policy:

path: The property path that contains vectors (required).
dataType: The data type of the vector property. Supported types are float32, float16, int8, and uint8.
dimensions: The dimensionality or length of each vector in the path. All vectors in a path should have the same number of dimensions. The default is 1536.
distanceFunction: The metric used to compute distance/similarity. Supported metrics are:
- cosine, which has values from -1 (least similar) to +1 (most similar).
- dotproduct, which has values from -inf (least similar) to +inf (most similar).
- euclidean, which has values from 0 (most similar) to +inf (least similar).
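The three metrics can be sketched with their standard textbook definitions, as below. This is an illustrative sketch, not the engine's implementation; the function names simply mirror the distanceFunction values, and the sample vectors are arbitrary.

```python
import math

def cosine(a, b):
    """Cosine similarity: -1 (opposite direction) to +1 (same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def dotproduct(a, b):
    """Dot product: unbounded; larger values mean more similar."""
    return sum(x * y for x, y in zip(a, b))

def euclidean(a, b):
    """Euclidean distance: 0 (identical) upward; smaller means more similar."""
    return math.dist(a, b)

print(cosine([1.0, 0.0], [1.0, 0.0]))     # same direction
print(cosine([1.0, 0.0], [-1.0, 0.0]))    # opposite direction
print(dotproduct([1, 2, 3], [4, 5, 6]))
print(euclidean([0.0, 0.0], [3.0, 4.0]))
```

Note that the ordering of "most similar" differs by metric: results are ranked from high to low for cosine and dotproduct, and from low to high for euclidean.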

Note

Each unique path can have at most one policy. However, multiple policies can be specified if they all target a different path.

Note

Many embedding models represent the elements of a vector using float32. Using float16 instead can reduce the storage footprint of vectors by 50%, although some reduction in accuracy might result.
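The trade-off in the preceding note can be checked with a small experiment. The following sketch uses Python's struct module to pack the same vector as float32 (format code f) and float16 (format code e), then compares the raw byte size and the rounding error. The 1536-dimensional vector is synthetic, and the byte counts describe the raw packed values only, not how the service stores them internally.

```python
import struct

def pack_vector(values, fmt):
    """Pack values as little-endian floats; fmt is 'f' (float32) or 'e' (float16)."""
    return struct.pack(f"<{len(values)}{fmt}", *values)

def max_roundtrip_error(values, fmt):
    """Largest absolute difference after packing and unpacking the values."""
    restored = struct.unpack(f"<{len(values)}{fmt}", pack_vector(values, fmt))
    return max(abs(a - b) for a, b in zip(values, restored))

# Synthetic 1536-dimensional vector with components in [0, 1).
vector = [i / 1536 for i in range(1536)]

size32 = len(pack_vector(vector, "f"))
size16 = len(pack_vector(vector, "e"))
print(size32, size16)  # 6144 3072

# float16 keeps fewer significant bits, so its rounding error is larger.
print(max_roundtrip_error(vector, "f"), max_roundtrip_error(vector, "e"))
```

Whether the extra rounding error matters depends on the embedding model and workload, so it is worth measuring recall on your own data before switching types.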

The container vector policy can be described as JSON objects. Here are two examples of valid container vector policies:

A policy with a single vector path

{
    "vectorEmbeddings": [
        {
            "path":"/vector1",
            "dataType":"float32",
            "distanceFunction":"cosine",
            "dimensions":1536
        }
    ]
}

A policy with two vector paths

{
    "vectorEmbeddings": [
        {
            "path":"/vector1",
            "dataType":"float32",
            "distanceFunction":"cosine",
            "dimensions":1536
        },
        {
            "path":"/vector2",
            "dataType":"float16",
            "distanceFunction":"dotproduct",
            "dimensions":100
        }
    ]
}
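Before sending a policy to the service, it can help to check it locally against the rules described in this section. The following is a minimal sketch of such a check; validate_vector_policy is a hypothetical helper written for this example, not part of any Azure SDK, and it covers only the constraints stated in this article.

```python
SUPPORTED_DATA_TYPES = {"float32", "float16", "int8", "uint8"}
SUPPORTED_DISTANCE_FUNCTIONS = {"cosine", "dotproduct", "euclidean"}

def validate_vector_policy(policy):
    """Return a list of problems found in a container vector policy dict."""
    problems = []
    seen_paths = set()
    for entry in policy.get("vectorEmbeddings", []):
        path = entry.get("path")
        if not path:
            problems.append("an entry is missing the required path")
            continue
        if path in seen_paths:
            problems.append(f"{path}: each path can have at most one policy")
        seen_paths.add(path)
        if "*" in path or "[" in path:
            problems.append(f"{path}: wildcard characters aren't supported")
        data_type = entry.get("dataType")
        if data_type is not None and data_type not in SUPPORTED_DATA_TYPES:
            problems.append(f"{path}: unsupported dataType {data_type!r}")
        function = entry.get("distanceFunction")
        if function is not None and function not in SUPPORTED_DISTANCE_FUNCTIONS:
            problems.append(f"{path}: unsupported distanceFunction {function!r}")
        dimensions = entry.get("dimensions", 1536)  # 1536 is the documented default
        if not isinstance(dimensions, int) or dimensions < 1:
            problems.append(f"{path}: dimensions must be a positive integer")
    return problems

policy = {
    "vectorEmbeddings": [
        {"path": "/vector1", "dataType": "float32",
         "distanceFunction": "cosine", "dimensions": 1536},
        {"path": "/vector2", "dataType": "float16",
         "distanceFunction": "dotproduct", "dimensions": 100},
    ]
}
print(validate_vector_policy(policy))  # an empty list means no problems were found
```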

Vector indexing policies

Vector indexes increase efficiency when you perform vector searches using the VectorDistance system function. Vector searches have lower latency, higher throughput, and lower RU consumption when they use a vector index. You can specify the following types of vector index policies:

Type	Description	Max dimensions
`flat`	Stores vectors on the same index as other indexed properties.	505
`quantizedFlat`	Quantizes (compresses) vectors before storing on the index. This can improve latency and throughput at the cost of a small amount of accuracy.	4096
`diskANN`	Creates an index based on DiskANN for fast and efficient approximate search.	4096

Note

The quantizedFlat and diskANN indexes require that at least 1,000 vectors are inserted. This is to ensure accuracy of the quantization process. If there are fewer than 1,000 vectors, a full scan is executed instead.

A few points to consider:

The flat and quantizedFlat index types use Azure Cosmos DB's index to store and read each vector when performing a vector search. Vector searches with a flat index are brute-force searches and produce 100% accuracy or recall. That is, it's guaranteed to find the most similar vectors in the dataset. However, there's a limitation of 505 dimensions for vectors on a flat index.
The quantizedFlat index stores quantized (compressed) vectors on the index. Vector searches with a quantizedFlat index are also brute-force searches, but their accuracy might be slightly less than 100% because the vectors are quantized before they're added to the index. In exchange, vector searches with quantizedFlat should have lower latency, higher throughput, and lower RU cost than vector searches on a flat index. This is a good option for smaller scenarios, or scenarios where you're using query filters to narrow down the vector search to a relatively small set of vectors. quantizedFlat is recommended when the number of vectors scoped to your search is 50,000 or fewer. This is only a general guideline; test actual performance, because each scenario can be different.
The diskANN index is a separate index defined specifically for vectors using DiskANN, a suite of high performance vector indexing algorithms developed by Microsoft Research. DiskANN indexes can offer some of the lowest latency, highest throughput, and lowest RU cost queries, while still maintaining high accuracy. In general, DiskANN is the most performant of all index types if your search query is scoped to more than 50,000 vectors.
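The guidance above can be condensed into a rough decision helper. This is only a sketch of the general guidelines in this section (the 505- and 4,096-dimension limits and the 50,000-vector rule of thumb); suggest_index_type and its parameters are hypothetical names, and real workloads should be tested rather than relying on it.

```python
def suggest_index_type(dimensions, vectors_in_scope, need_exact=False):
    """Suggest a vector index type from the guidelines in this article.

    dimensions: length of each vector.
    vectors_in_scope: roughly how many vectors a typical query searches
        after filters and partition keys are applied.
    need_exact: True if 100% recall is required.
    """
    if dimensions > 4096:
        raise ValueError("no index type supports more than 4,096 dimensions")
    if need_exact:
        if dimensions > 505:
            raise ValueError("flat (exact) indexes support at most 505 dimensions")
        return "flat"
    if vectors_in_scope <= 50_000:
        return "quantizedFlat"
    return "diskANN"

print(suggest_index_type(1536, 10_000))
print(suggest_index_type(1536, 2_000_000))
print(suggest_index_type(256, 500, need_exact=True))
```

Keep in mind that quantizedFlat and diskANN fall back to a full scan until at least 1,000 vectors are inserted, so the suggestion only matters once the container holds enough data.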

Note

You can add new path configurations or remove existing ones, but you can't change the settings of a vector embedding policy or vector indexing policy directly. To change them, you must first drop the existing vector policy and/or index, and then add it back with the new configuration.

Here are examples of valid vector index policies:

{
    "indexingMode": "consistent",
    "automatic": true,
    "includedPaths": [
        {
            "path": "/*"
        }
    ],
    "excludedPaths": [
        {
            "path": "/_etag/?"
        },
        {
            "path": "/vector1/*"
        }
    ],
    "vectorIndexes": [
        {
            "path": "/vector1",
            "type": "diskANN"
        }
    ]
}

{
    "indexingMode": "consistent",
    "automatic": true,
    "includedPaths": [
        {
            "path": "/*"
        }
    ],
    "excludedPaths": [
        {
            "path": "/_etag/?"
        }
    ],
    "vectorIndexes": [
        {
            "path": "/vector1",
            "type": "quantizedFlat"
        },
        {
            "path": "/vector2",
            "type": "diskANN"
        }
    ]
}
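Policies like these can also be generated programmatically. The following sketch builds an indexing policy dict from a list of vector paths and index types. build_indexing_policy is a hypothetical helper, not an SDK function, and it assumes you want each vector path excluded from the regular index, as the first example above does; the second example shows that this exclusion is optional.

```python
import json

VECTOR_INDEX_TYPES = {"flat", "quantizedFlat", "diskANN"}

def build_indexing_policy(vector_indexes):
    """Build an indexing policy from (path, index_type) pairs.

    Each vector path is also added to excludedPaths (an assumption of this
    sketch) so that it is covered only by its vector index.
    """
    excluded = [{"path": "/_etag/?"}]
    entries = []
    for path, index_type in vector_indexes:
        if index_type not in VECTOR_INDEX_TYPES:
            raise ValueError(f"unknown vector index type: {index_type!r}")
        excluded.append({"path": f"{path}/*"})
        entries.append({"path": path, "type": index_type})
    return {
        "indexingMode": "consistent",
        "automatic": True,
        "includedPaths": [{"path": "/*"}],
        "excludedPaths": excluded,
        "vectorIndexes": entries,
    }

policy = build_indexing_policy([("/vector1", "diskANN")])
print(json.dumps(policy, indent=4))
```

The resulting dict has the same shape as the JSON examples above, so it can be serialized with json.dumps or passed to whichever client library you use to create the container.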

Important

Wildcard characters (*, []) and vector paths nested inside arrays aren't currently supported in the vector policy or vector index.

You can also optionally configure a quantizerType within each vectorIndexes entry. This controls how vectors are quantized prior to indexing.

product (default): Uses standard product quantization. Provides balanced performance and accuracy for most workloads.
spherical (public preview): Can reduce quantization time, leading to slightly faster indexing and improved performance. This method can also provide higher and more stable recall over time with very high-dimensional embeddings.

An example of how to define the quantizerType is shown below:

{
    "indexingMode": "consistent",
    "automatic": true,
    "includedPaths": [
        {
            "path": "/*"
        }
    ],
    "excludedPaths": [
        {
            "path": "/_etag/?"
        }
    ],
    "vectorIndexes": [
        {
            "path": "/vector1",
            "type": "quantizedFlat",
            "quantizerType": "spherical"
        },
        {
            "path": "/vector2",
            "type": "diskANN",
            "quantizerType": "product"
        }
    ]
}

If not specified, product is used by default.

Perform vector search with queries using VectorDistance

After you create a container with the desired vector policy and insert vector data into it, you can conduct a vector search by using the VectorDistance system function in a query. The following example shows a NoSQL query that projects the similarity score as the alias SimilarityScore and sorts the results from most similar to least similar:

SELECT TOP 10 c.title, VectorDistance(c.contentVector, [1,2,3]) AS SimilarityScore   
FROM c  
ORDER BY VectorDistance(c.contentVector, [1,2,3])

Important

Always use a TOP N clause in the SELECT statement of a query. Otherwise, the vector search tries to return many more results, and the query costs more RUs and has higher latency than necessary.
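In application code, the query vector usually comes from an embedding model rather than being typed inline. The following sketch builds the query text and a parameter list in the name/value form used for parameterized Azure Cosmos DB queries, and it always emits a TOP N clause. build_vector_query, the @queryVector parameter name, and the c.title projection are choices made for this example; the optional filter is inserted as-is, so it must come from trusted code rather than user input.

```python
def build_vector_query(vector_property, query_vector, top_n=10, where=None):
    """Build a parameterized vector search query that always includes TOP N.

    vector_property: property holding the vector, for example "contentVector".
    where: optional filter expression, for example "c.category = 'news'".
    """
    if not isinstance(top_n, int) or top_n < 1:
        raise ValueError("top_n must be a positive integer")
    distance = f"VectorDistance(c.{vector_property}, @queryVector)"
    lines = [
        f"SELECT TOP {top_n} c.title, {distance} AS SimilarityScore",
        "FROM c",
    ]
    if where:
        lines.append(f"WHERE {where}")
    lines.append(f"ORDER BY {distance}")
    return {
        "query": "\n".join(lines),
        "parameters": [{"name": "@queryVector", "value": list(query_vector)}],
    }

spec = build_vector_query("contentVector", [1, 2, 3], top_n=10)
print(spec["query"])
print(spec["parameters"])
```

The returned query text and parameters can then be handed to the query method of your client library; the exact call depends on the SDK and version you use.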

Current limitations

Vector indexing and search in Azure Cosmos DB for NoSQL has some limitations.

quantizedFlat and diskANN indexes require at least 1,000 vectors to be indexed to ensure that the quantization is accurate. If fewer than 1,000 vectors are indexed, then a full-scan is used instead and RU charges might be higher.
Vectors indexed with the flat index type can be at most 505 dimensions. Vectors indexed with the quantizedFlat or diskANN index type can be at most 4,096 dimensions.
The rate of vector insertions should be limited. Very large ingestion (in excess of 5M vectors) in a short period of time might require more index build time.
At this time, vector indexing and search aren't supported on accounts with Shared Throughput.
Once vector indexing and search are enabled on a container, it can't be disabled.

Last updated on 2026-06-15
