Frequently asked questions about Azure Synapse Link for Azure Cosmos DB

APPLIES TO: NoSQL, MongoDB, Gremlin

Azure Synapse Link for Azure Cosmos DB creates a tight integration between Azure Cosmos DB and Azure Synapse Analytics. It enables customers to run near real-time analytics over their operational data with full performance isolation from their transactional workloads and without an ETL pipeline. This article answers commonly asked questions about Synapse Link for Azure Cosmos DB.

General FAQ

Azure Synapse Link is supported for the Azure Cosmos DB APIs for NoSQL, Gremlin, and MongoDB. The support for Azure Cosmos DB API for Gremlin is now in preview.

For multi-region Azure Cosmos DB accounts, the data stored in the analytical store is also distributed across regions. The analytical store exists in every region where you have a transactional store. Whether the account has a single write region or multiple write regions, analytical queries from Azure Synapse Analytics can be served from the closest local region.

When Azure Synapse Link is enabled for a multi-region account, analytical store is created in all regions chosen by customers for transactional geo-replication. The underlying data is optimized for throughput and transactional consistency in the transactional store.

Is analytical store supported in all Azure Cosmos DB regions?

Yes.

Currently, after the Synapse Link capability is enabled at the account level, you can't disable it. There are no billing implications if the Synapse Link capability is enabled at the account level and there are no analytical store enabled containers.

If you need to turn off the capability, delete the account and create a new Azure Cosmos DB account, migrating the data if necessary.

To disable Synapse Link on a container, use Azure CLI or PowerShell to set the analytical TTL to 0. This turns off Synapse Link for the container and permanently deletes its analytical store. Currently this action can't be undone, and it blocks the migration of the database account to continuous backup.

Does analytical store have any impact on Azure Cosmos DB transactional SLAs?

No, there's no impact.

Yes, for both API for MongoDB and API for NoSQL database accounts. Use CLI or PowerShell for MongoDB accounts.

You need the Contributor role to enable Synapse Link at the account level.

Why doesn't Synapse Workspace list my Gremlin graphs in Data Explorer?

Data Explorer in Synapse Workspaces doesn't support Gremlin graphs in the tree view, but you can still run queries against them.

Azure Cosmos DB analytical store

Can I enable analytical store on existing containers?

Yes. Currently you can use the Azure portal, Azure CLI, PowerShell, or the Azure Cosmos DB SDKs to enable analytical store for existing API for NoSQL containers. For existing API for MongoDB collections, you can use Azure CLI or PowerShell.

Can I see analytical store files using Azure Data Explorer?

No. Analytical store is persisted in a storage account located in a Cosmos DB internal subscription. Customers don't have access to this storage account and have to use Azure Synapse runtimes to read the data.

Can I disable analytical store on my Azure Cosmos DB containers?

Yes, analytical store can be disabled in API for NoSQL containers and in API for MongoDB collections, using PowerShell or CLI. Currently this action can't be undone.

Is analytical store supported for Azure Cosmos DB containers with autoscale provisioned throughput?

Yes, the analytical store can be enabled on containers with autoscale provisioned throughput.

Is there any effect on Azure Cosmos DB transactional store provisioned RUs?

Azure Cosmos DB guarantees performance isolation between the transactional and analytical workloads. Enabling the analytical store on a container doesn't impact the Azure Cosmos DB RU/s. The transactions (read & write) and storage costs for the analytical store are charged separately. See the pricing for Azure Cosmos DB analytical store for more details.

Can I restrict network access to Azure Cosmos DB analytical store?

Yes, you can configure a managed private endpoint and restrict network access to the analytical store to the Azure Synapse managed virtual network. Managed private endpoints establish a private link to your analytical store.

You can add both transactional store and analytical store private endpoints to the same Azure Cosmos DB account in an Azure Synapse Analytics workspace. If you only want to run analytical queries, you may only want to enable the analytical private endpoint in Synapse Analytics workspace.

Can I use customer-managed keys with the Azure Cosmos DB analytical store?

You can seamlessly encrypt the data across transactional and analytical stores using the same customer-managed keys in an automatic and transparent manner. To use customer-managed keys with the analytical store, you need to use your Azure Cosmos DB account's system-assigned managed identity in your Azure Key Vault access policy. You should then be able to enable the analytical store on your account. For more information, see the article on configuring customer-managed keys using Azure Cosmos DB accounts' managed identities.

Are delete and update operations on the transactional store reflected in the analytical store?

Yes, deletes and updates to the data in the transactional store are reflected in the analytical store. You can configure the Time to Live (TTL) on the container to include historical data so that the analytical store retains all versions of items that satisfy the analytical TTL criteria. See the overview of analytical TTL for more details.

Can I connect to analytical store from analytics engines other than Azure Synapse Analytics?

You can only access and run queries against the analytical store using the various run-times provided by Azure Synapse Analytics. The analytical store can be queried and analyzed using:

  • Synapse Spark, with full support for Scala, Python, SparkSQL, and C#. Synapse Spark is central to data engineering and data science scenarios.
  • Serverless SQL pool, with the T-SQL language and support for familiar BI tools (for example, Power BI Premium).
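
As a rough illustration of the Synapse Spark option, the sketch below reads a container's analytical store into a DataFrame from a Synapse notebook. It assumes the `cosmos.olap` format and the `spark.synapse.linkedService` and `spark.cosmos.container` options of the Synapse Spark connector. The helper names are hypothetical, and the linked service and container names you pass in are your own.

```python
def analytical_store_options(linked_service: str, container: str) -> dict:
    """Build the reader options for an analytical store read.

    Both arguments are names you choose in your own workspace;
    nothing here is specific to a particular account.
    """
    if not linked_service or not container:
        raise ValueError("linked_service and container are both required")
    return {
        "spark.synapse.linkedService": linked_service,
        "spark.cosmos.container": container,
    }


def read_analytical_store(spark, linked_service: str, container: str):
    """Load a container's analytical store as a Spark DataFrame.

    `spark` is the SparkSession that a Synapse notebook provides.
    The "cosmos.olap" format targets the analytical store, so the
    read does not consume the container's transactional RU/s.
    """
    options = analytical_store_options(linked_service, container)
    return spark.read.format("cosmos.olap").options(**options).load()
```

In a notebook, a call such as `read_analytical_store(spark, "CosmosDbLinkedService", "Orders")` would return a DataFrame, where both names are placeholders for your own linked service and container.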

Can I connect to analytical store from Synapse SQL provisioned?

At this time, the analytical store can't be accessed from Synapse SQL provisioned.

Can I write the query aggregation results from Synapse back to the analytical store?

No, analytical store is read-only.

Is the autosync replication from transactional store to the analytical store asynchronous or synchronous and what are the latencies?

Auto-sync latency is usually within 2 minutes. For a shared throughput database with a large number of containers, the auto-sync latency of individual containers can be higher, up to 5 minutes.

Are there any scenarios where the items from the transactional store are not automatically propagated to the analytical store?

If specific items in your container violate the well-defined schema for analytics, they're not included in analytical store.

Can I partition the data in analytical store differently from transactional store?

By default, analytical store isn't partitioned. If your analytical queries have frequently used filters, use custom partitioning for better performance. For more information, see the custom partitioning documentation.

Can I customize or override the way transactional data is transformed into columnar format in the analytical store?

Currently you can’t transform the data items when they're automatically propagated from the transactional store to analytical store. If you have scenarios blocked by this limitation, email the Azure Cosmos DB team.

Can I access analytical store with Azure Cosmos DB SDKs?

No, you can't access analytical store with the Azure Cosmos DB SDKs. You need to use Azure Synapse Analytics Spark pools or serverless SQL pools.

Can I access analytical store with Azure Cosmos DB REST APIs?

No, you can't access analytical store with the Azure Cosmos DB REST APIs. You need to use Azure Synapse Analytics Spark pools or serverless SQL pools.

Is analytical store supported by Terraform?

Currently Terraform doesn’t support analytical store containers. Check Terraform GitHub Issues for more information.

You need at least the Operator role to enable Synapse Link, and consequently the analytical store, at the container or collection level.

Analytical Time to live (TTL)

Is TTL for analytical data supported at both container and item level?

At this time, TTL for analytical data can only be configured at container level and there's no support to set analytical TTL at item level.

After setting the container level analytical TTL on an Azure Cosmos DB container, can I change to a different value later?

Yes, analytical TTL can be updated to any valid value. See the Analytical TTL article for more details about analytical TTL.
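
To make "any valid value" concrete, here is a minimal sketch of how a container-level analytical TTL value is interpreted. It assumes the semantics described in the Analytical TTL article: a missing value or 0 means the analytical store is not enabled, -1 means data is retained indefinitely, and a positive number n means items expire n seconds after their last modification in the transactional store. The function name is hypothetical.

```python
from typing import Optional


def describe_analytical_ttl(ttl: Optional[int]) -> str:
    """Explain what a container-level analytical TTL value means.

    Assumed semantics (see the Analytical TTL article):
      None or 0 -> analytical store is not enabled
      -1        -> items are retained indefinitely
      n > 0     -> items expire n seconds after their last
                   modification in the transactional store
    """
    if ttl is None or ttl == 0:
        return "analytical store disabled"
    if ttl == -1:
        return "retain indefinitely"
    if ttl > 0:
        return f"expire {ttl} seconds after last modification"
    raise ValueError(f"invalid analytical TTL: {ttl}")
```

Keep in mind that, as noted earlier in this article, setting the analytical TTL to 0 permanently deletes the analytical store and currently can't be undone.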

Can I update or delete an item from the analytical store after it has been TTL’d out from the transactional store?

All transactional updates and deletes are copied to the analytical store, but if the item has been purged from the transactional store, it can't be updated in the analytical store. To learn more, see the Analytical TTL article.

Billing

The billing model of Azure Synapse Link includes the costs incurred by using the Azure Cosmos DB analytical store and the Synapse runtime. To learn more, see the Azure Cosmos DB analytical store pricing and Azure Synapse Analytics pricing articles.

Enabling Synapse Link at the account level incurs no charges by itself. Charges only occur when you create an analytical store enabled container and start to load data.

Security

What are the ways to authenticate with the analytical store?

Authentication with the analytical store is the same as with the transactional store. For a given database, you can authenticate with the primary or read-only key. You can use linked services in Azure Synapse Studio to avoid pasting the Azure Cosmos DB keys in the Spark notebooks. Access to this linked service is available to everyone who has access to the workspace. When using Synapse serverless SQL pools, you can query the Azure Cosmos DB analytical store by pre-creating SQL credentials and referencing them in the OPENROWSET function. To learn more, see the Query with a serverless SQL pool in Azure Synapse Link article.

Yes, Azure Synapse Link supports configuring customer-managed keys using your Azure Cosmos DB account's managed identity. You can seamlessly encrypt the data across transactional and analytical stores using the same customer-managed keys in an automatic and transparent manner. To learn more, see the article on configuring customer-managed keys using Azure Cosmos DB accounts' managed identities.

Yes, you can control network access to the data in the transactional and analytical stores independently. Network isolation is done using separate managed private endpoints for each store, within managed virtual networks in Azure Synapse workspaces. To learn more, see the Configure private endpoints for analytical store article.

403 errors are usually caused by network or firewall settings that prevent users from accessing specific data, even from the portal. The most common cause is that step 1 of the process to enable network isolation using private endpoints hasn't been performed with Azure CLI or PowerShell.

Synapse run-times

What are the currently supported Synapse run-times to access Azure Cosmos DB analytical store?

Azure Synapse runtime                Current support
Azure Synapse Spark pools            Read, Write (through transactional store), Table, Temporary View
Azure Synapse serverless SQL pool    Read, View
Azure Synapse SQL Provisioned        Not available

Do Spark tables sync with SQL Serverless tables the same way they do with Azure Data Lake?

Currently, this feature isn't available.

Can I do Spark structured streaming from analytical store?

Currently Spark structured streaming support for Azure Cosmos DB is implemented using the change feed functionality of the transactional store and it’s not yet supported from analytical store.

Is streaming supported?

We don't support streaming of data from the analytical store.

Azure Synapse Studio

In Azure Synapse Studio, how do I recognize if I'm connected to an Azure Cosmos DB container with the analytical store enabled?

An Azure Cosmos DB container enabled with analytical store has the following icon:

[Icon: Azure Cosmos DB container enabled with analytical store]

A transactional store container is represented with the following icon:

[Icon: Azure Cosmos DB container enabled with transactional store]

How do you pass Azure Cosmos DB credentials from Azure Synapse Studio?

Currently Azure Cosmos DB credentials are passed while creating the linked service by the user who has access to the Azure Cosmos DB databases. Access to that store is available to other users who have access to the workspace.

Can I use SQL Server Management Studio to query analytical store using Synapse Serverless SQL pool?

Yes.