This article answers commonly asked questions about Azure Data Explorer ingestion.
The batching manager buffers and batches ingress data based on the ingestion settings in the ingestion batching policy. The ingestion batching policy sets batch limits according to three limiting factors, whichever is reached first: time elapsed since batch creation, accumulated number of items (blobs), or total batch size. The default batching settings are 5 minutes / 1 GB / 1,000 blobs, meaning there will be at least a 5-minute delay when queueing sample data for ingestion.
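Before changing anything, you can check which batching settings are in effect. A minimal sketch, assuming a table named MyTable in a database named MyDatabase (both placeholders):

```kusto
// Show the ingestion batching policy at the table and the database level
// (run each command separately). A null policy means the defaults
// (5 minutes / 1,000 blobs / 1 GB) apply.
.show table MyTable policy ingestionbatching

.show database MyDatabase policy ingestionbatching
```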
Queued ingestion is optimized for high ingestion throughput, and is the preferred and most performant type of ingestion. In contrast, streaming ingestion is optimized for low ingestion latency. Learn more about queued versus streaming ingestion.
If the default settings for the ingestion batching policy don't suit your needs, you can try lowering the batching policy time. See Optimize for throughput. You should also update the settings when you scale up ingestion. Changes to batching policy settings can take up to 5 minutes to take effect.
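For example, lowering the batching time for one table might look like the following sketch; MyTable and the values are placeholders, not a general recommendation:

```kusto
// Seal batches after 30 seconds, 500 blobs, or 1 GB of uncompressed data,
// whichever is reached first.
.alter table MyTable policy ingestionbatching
'{"MaximumBatchingTimeSpan": "00:00:30", "MaximumNumberOfItems": 500, "MaximumRawDataSizeMB": 1024}'
```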
Ingestion latency can result from the ingestion batching policy settings or from a data backlog buildup. To address this, adjust the batching policy settings. Latencies that are part of the ingestion process can be monitored.
To view queued ingestion latency metrics, see monitoring ingestion latency. The Stage Latency and Discovery Latency metrics show latencies in the ingestion process and reveal whether there are any long latencies.
Learn about the causes of latency, and adjust the batching policy settings to address issues that cause latencies, such as data backlogs, inefficient batching, batching large amounts of uncompressed data, or ingesting very small amounts of data.
The batching policy data size is set for uncompressed data. When ingesting compressed data, the uncompressed data size is calculated from the ingestion batching parameters, the ZIP file metadata, or a factor applied to the compressed file size.
You can monitor ingestion using metrics, and by setting up and using ingestion diagnostic logs for detailed table-level monitoring, viewing detailed ingestion error codes, and so on. You can select specific metrics to track, choose how to aggregate your results, and create metric charts to view on your dashboard. See more about streaming metrics and how to monitor queued ingestion.
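If you stream the diagnostic logs to a Log Analytics workspace, you can query them with KQL. The following is a sketch that assumes the AzureDiagnostics collection mode; adjust the table and category names to match your diagnostic settings:

```kusto
// Count failed-ingestion log entries per hour over the last day.
AzureDiagnostics
| where TimeGenerated > ago(1d)
| where Category == "FailedIngestion"
| summarize Failures = count() by bin(TimeGenerated, 1h)
```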
You can use the portal's Azure Monitor Insights to help you understand how Azure Data Explorer is performing and how it's being used. The Insight view is based on metrics and diagnostic logs that can be streamed to a Log Analytics workspace. Use the .dup-next-ingest command to duplicate the next ingestion into a storage container and review the details and metadata of the ingestion.
The full ingestion process can be monitored using ingestion metrics and diagnostic logs.
Ingestion failures can be monitored using the IngestionResult metric or the FailedIngestion diagnostic log.
The .show ingestion failures command shows ingestion failures associated with the data ingestion management commands, and isn't recommended for monitoring errors.
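For an ad-hoc spot check, you can filter the command's output; the column names used here (FailedOn, Table, Details) are assumptions based on typical output and should be verified against your cluster:

```kusto
// List failures reported in the last day for one table (MyTable is a placeholder).
.show ingestion failures
| where FailedOn > ago(1d) and Table == "MyTable"
| project FailedOn, Table, Details
```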
The .dup-next-failed-ingest command provides information on the next failed ingestion by uploading the ingestion files and metadata to a storage container. This can be useful for checking an ingestion flow, though it isn't advised for steady monitoring.
Many occurrences of the RetryAttemptsExceeded metric status indicate that ingestion exceeded the retry attempt limit or time-span limit following a recurring transient error. If this error also appears in the diagnostic log with error code General_RetryAttemptsExceeded and the details "Failed to access storage and get information for the blob," it indicates a high-load storage access issue.
During Event Grid ingestion, Azure Data Explorer requests blob details from the storage account.
When the load is too high on a storage account, storage access may fail, and information needed for ingestion can't be retrieved.
If the number of attempts exceeds the defined maximum retries, Azure Data Explorer stops trying to ingest the failed blob.
To prevent a load issue, use a premium storage account or divide the ingested data over more storage accounts.
To discover related errors, check the FailedIngestion diagnostic logs for error codes and for the paths of any failed blobs.
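If the logs are collected in resource-specific mode, a query along the following lines can surface the error codes and blob paths; the table and column names (FailedIngestion, ErrorCode, IngestionSourcePath) are assumptions and may differ in your workspace:

```kusto
// Failed-ingestion entries that exceeded the retry limit, with the failed blob path.
FailedIngestion
| where TimeGenerated > ago(1d)
| where ErrorCode == "General_RetryAttemptsExceeded"
| project TimeGenerated, Database, Table, ErrorCode, IngestionSourcePath
```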
To efficiently ingest large quantities of historical data, use LightIngest. For more information, see ingest historical data. To improve performance for many small files, adjust the batching policy: change the batching conditions and address latencies. To improve performance when ingesting extremely large data files, use Azure Data Factory (ADF), a cloud-based data integration service.
Malformed data (unparsable, too large, or not conforming to the schema) might fail to be ingested properly. For more information, see Ingestion of invalid data.
When ingesting via an SDK, you can use the ingestion batching policy settings to improve performance. Try incrementally decreasing the data size set in the table or database batching policy down toward 250 MB, and check whether performance improves.
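A minimal sketch of that adjustment at the database level; MyDatabase and the time and item limits are placeholders that you should match to your workload:

```kusto
// Lower the uncompressed-size trigger toward 250 MB and measure the effect.
.alter database MyDatabase policy ingestionbatching
'{"MaximumBatchingTimeSpan": "00:01:00", "MaximumNumberOfItems": 1000, "MaximumRawDataSizeMB": 250}'
```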
If you use the Kafka Sink connector, tune it to work together with the ingestion batching policy by adjusting the batching time, size, and item count.
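As a sketch, assuming the open-source kafka-sink-azure-kusto connector, the connector-side flush settings to align with the batching policy look like the following; the property names and values should be verified against your connector version:

```properties
# Flush to Azure Data Explorer after ~10 MB or 60 seconds, whichever comes first;
# keep these roughly in line with the target table's ingestion batching policy.
flush.size.bytes=10485760
flush.interval.ms=60000
```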