In this article, you learn how to ingest blobs from your storage account into Azure Data Explorer using an Event Grid data connection. You'll create an Event Grid data connection that sets up an Azure Event Grid subscription. The Event Grid subscription routes events from your storage account to Azure Data Explorer through Azure Event Hubs.
Note
Ingestion supports a maximum file size of 6 GB. The recommendation is to ingest files between 100 MB and 1 GB.
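As a rough illustration of the size guidance in this note, the following Python sketch classifies a file before you upload it. It's only a sketch under stated assumptions: it treats GB and MB as decimal units and checks the on-disk size, because the note doesn't say whether the limit applies to compressed or uncompressed data. The function names are hypothetical and aren't part of any Azure SDK.

```python
import os

# Assumption: "GB" and "MB" are decimal units (10**9 and 10**6 bytes).
MAX_INGEST_BYTES = 6 * 10**9
RECOMMENDED_MIN_BYTES = 100 * 10**6
RECOMMENDED_MAX_BYTES = 1 * 10**9


def classify_size(size_bytes: int) -> str:
    """Return 'too_large', 'recommended', or 'allowed' for a size in bytes."""
    if size_bytes > MAX_INGEST_BYTES:
        return "too_large"
    if RECOMMENDED_MIN_BYTES <= size_bytes <= RECOMMENDED_MAX_BYTES:
        return "recommended"
    # Within the 6 GB limit, but outside the recommended 100 MB to 1 GB range.
    return "allowed"


def classify_file(path: str) -> str:
    """Classify a local file by its on-disk size (hypothetical helper)."""
    return classify_size(os.path.getsize(path))
```

A file classified as `allowed` can still be ingested. The classification only flags that it falls outside the recommended range.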
To learn how to create the connection using the Kusto SDKs, see Create an Event Grid data connection with SDKs.
For general information about ingesting into Azure Data Explorer from Event Grid, see Connect to Event Grid.
Note
To achieve the best performance with the Event Grid connection, set the rawSizeBytes ingestion property via the blob metadata. For more information, see ingestion properties.
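To make the rawSizeBytes guidance concrete, the following Python sketch computes the uncompressed size of a local file and builds the metadata dictionary you would attach to the blob. It assumes that compressed files use gzip and end in `.gz`. For any other file it falls back to the on-disk size. The helper names are hypothetical. Only the `rawSizeBytes` key comes from this article.

```python
import gzip
import os


def uncompressed_size(path: str, chunk_size: int = 1024 * 1024) -> int:
    """Return the uncompressed size of a file in bytes.

    Assumption: only gzip files with a .gz suffix are treated as compressed.
    The file is streamed in chunks so it's never fully loaded into memory.
    """
    if not path.endswith(".gz"):
        return os.path.getsize(path)
    total = 0
    with gzip.open(path, "rb") as stream:
        while True:
            chunk = stream.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return total


def build_blob_metadata(path: str) -> dict:
    """Build blob metadata; metadata values must be strings."""
    return {"rawSizeBytes": str(uncompressed_size(path))}
```

You would pass the resulting dictionary as the blob metadata when you upload the file with the storage SDK of your choice.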
- An Azure subscription. Create an Azure account.
- An Azure Data Explorer cluster and database. Create a cluster and database.
- A destination table. Create a table or use an existing table.
- An ingestion mapping for the table.
- A storage account. An Event Grid notification subscription can be set on Azure Storage accounts for BlobStorage, StorageV2, or Data Lake Storage Gen2.
- Have the Event Grid resource provider registered.
In this section, you establish a connection between Event Grid and your Azure Data Explorer table.
The following example shows an Azure Resource Manager template for adding an Event Grid data connection. You can edit and deploy the template in the Azure portal by using the form.
{
"$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json#",
"contentVersion": "1.0.0.0",
"parameters": {
"namespaces_eventhubns_name": {
"type": "string",
"defaultValue": "eventhubns",
"metadata": {
"description": "Specifies the event hub namespace name."
}
},
"EventHubs_eventhubdemo_name": {
"type": "string",
"defaultValue": "eventhubdemo",
"metadata": {
"description": "Specifies the event hub name."
}
},
"consumergroup_default_name": {
"type": "string",
"defaultValue": "$Default",
"metadata": {
"description": "Specifies the consumer group of the event hub."
}
},
"StorageAccounts_storagedemo_name": {
"type": "string",
"defaultValue": "storagedemo",
"metadata": {
"description": "Specifies the storage account name"
}
},
"Clusters_kustocluster_name": {
"type": "string",
"defaultValue": "kustocluster",
"metadata": {
"description": "Specifies the name of the cluster"
}
},
"databases_kustodb_name": {
"type": "string",
"defaultValue": "kustodb",
"metadata": {
"description": "Specifies the name of the database"
}
},
"tables_kustotable_name": {
"type": "string",
"defaultValue": "kustotable",
"metadata": {
"description": "Specifies the name of the table"
}
},
"mapping_kustomapping_name": {
"type": "string",
"defaultValue": "kustomapping",
"metadata": {
"description": "Specifies the name of the mapping rule"
}
},
"dataformat_type": {
"type": "string",
"defaultValue": "csv",
"metadata": {
"description": "Specifies the data format"
}
},
"databaseRouting_type": {
"type": "string",
"defaultValue": "Single",
"metadata": {
"description": "The database routing for the connection. If you set the value to **Single**, the data connection will be routed to a single database in the cluster as specified in the *databaseName* setting. If you set the value to **Multi**, you can override the default target database using the *Database* EventData property."
}
},
"dataconnections_kustodc_name": {
"type": "string",
"defaultValue": "kustodc",
"metadata": {
"description": "Name of the data connection to create"
}
},
"subscriptionId": {
"type": "string",
"defaultValue": "[subscription().subscriptionId]",
"metadata": {
"description": "Specifies the subscriptionId of the resources"
}
},
"resourceGroup": {
"type": "string",
"defaultValue": "[resourceGroup().name]",
"metadata": {
"description": "Specifies the resourceGroup of the resources"
}
},
"location": {
"type": "string",
"defaultValue": "[resourceGroup().location]",
"metadata": {
"description": "Location for all resources."
}
}
},
"variables": {
},
"resources": [{
"type": "Microsoft.Kusto/Clusters/Databases/DataConnections",
"apiVersion": "2022-02-01",
"name": "[concat(parameters('Clusters_kustocluster_name'), '/', parameters('databases_kustodb_name'), '/', parameters('dataconnections_kustodc_name'))]",
"location": "[parameters('location')]",
"kind": "EventGrid",
"properties": {
"managedIdentityResourceId": "[resourceId('Microsoft.Kusto/clusters', parameters('Clusters_kustocluster_name'))]",
"storageAccountResourceId": "[resourceId(parameters('subscriptionId'), parameters('resourceGroup'), 'Microsoft.Storage/storageAccounts', parameters('StorageAccounts_storagedemo_name'))]",
"eventHubResourceId": "[resourceId(parameters('subscriptionId'), parameters('resourceGroup'), 'Microsoft.EventHub/namespaces/eventhubs', parameters('namespaces_eventhubns_name'), parameters('EventHubs_eventhubdemo_name'))]",
"consumerGroup": "[parameters('consumergroup_default_name')]",
"tableName": "[parameters('tables_kustotable_name')]",
"mappingRuleName": "[parameters('mapping_kustomapping_name')]",
"dataFormat": "[parameters('dataformat_type')]",
"databaseRouting": "[parameters('databaseRouting_type')]"
}
}
]
}
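Every parameter in the template has a demo default value, so a real deployment needs to override most of them. If you deploy with a tool that accepts an ARM parameters file instead of the portal form, a short script can generate that file. The following Python sketch is one way to do it. The parameter names come from the template, but the override values (`contosoadx`, `telemetry`, and so on) are hypothetical placeholders.

```python
import json

# Schema for ARM deployment parameter files that pair with the template above.
PARAMETERS_SCHEMA = (
    "https://schema.management.azure.com/schemas/2015-01-01/"
    "deploymentParameters.json#"
)


def build_parameters_file(overrides: dict) -> dict:
    """Wrap each override as {"value": ...}, as the parameters file format requires."""
    return {
        "$schema": PARAMETERS_SCHEMA,
        "contentVersion": "1.0.0.0",
        "parameters": {name: {"value": value} for name, value in overrides.items()},
    }


if __name__ == "__main__":
    # Hypothetical values; replace them with the names of your own resources.
    overrides = {
        "Clusters_kustocluster_name": "contosoadx",
        "databases_kustodb_name": "telemetry",
        "tables_kustotable_name": "RawEvents",
        "mapping_kustomapping_name": "RawEventsMapping",
        "dataformat_type": "csv",
    }
    print(json.dumps(build_parameters_file(overrides), indent=2))
```

Parameters that you leave out of the file keep the default values defined in the template.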
This section shows how to trigger ingestion from Azure Blob Storage or Azure Data Lake Storage Gen2 to your cluster following blob creation or blob renaming.
The following code sample uses the Azure Blob Storage SDK to upload a file to Azure Blob Storage. The upload triggers the Event Grid data connection, which ingests the data into Azure Data Explorer.
using System.Collections.Generic;
using Azure.Storage.Blobs;
using Azure.Storage.Blobs.Models;
// Replace each <placeholder> with your own value.
var azureStorageAccountConnectionString = "<storage_account_connection_string>";
var containerName = "<container_name>";
var blobName = "<blob_name>";
var localFileName = "<file_to_upload>";
// Blob metadata values are strings, so provide the uncompressed size as a string.
var uncompressedSizeInBytes = "<uncompressed_size_in_bytes>";
var mapping = "<mapping_reference>";
// Create the container if it doesn't already exist.
var azureStorageAccount = new BlobServiceClient(azureStorageAccountConnectionString);
var container = azureStorageAccount.GetBlobContainerClient(containerName);
container.CreateIfNotExists();
// Define the blob metadata and upload options.
IDictionary<string, string> metadata = new Dictionary<string, string>();
metadata.Add("rawSizeBytes", uncompressedSizeInBytes);
metadata.Add("kustoIngestionMappingReference", mapping);
var uploadOptions = new BlobUploadOptions
{
    Metadata = metadata,
};
// Upload the file. The upload raises the event that triggers ingestion.
var blob = container.GetBlobClient(blobName);
blob.Upload(localFileName, uploadOptions);
Note
Azure Data Explorer doesn't delete blobs after ingestion. Retain the blobs for three to five days, and use Azure Blob Storage lifecycle management to handle blob deletion.
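As an illustration of the retention guidance in this note, the following Python sketch builds an Azure Blob Storage lifecycle management policy document that deletes block blobs a given number of days after they were last modified. It's a sketch under stated assumptions: the rule name, the default of five days, and the optional prefix filter are illustrative choices, and you still need to apply the resulting JSON to your storage account with the tool you normally use.

```python
import json
from typing import Optional


def build_delete_policy(days: int = 5, prefix: Optional[str] = None) -> dict:
    """Build a lifecycle policy that deletes block blobs after `days` days.

    Assumption: the rule name "delete-ingested-blobs" is a placeholder, and
    `prefix` (for example "container/folder/") limits the rule to ingested blobs.
    """
    if days < 1:
        raise ValueError("days must be at least 1")
    filters = {"blobTypes": ["blockBlob"]}
    if prefix:
        filters["prefixMatch"] = [prefix]
    return {
        "rules": [
            {
                "enabled": True,
                "name": "delete-ingested-blobs",
                "type": "Lifecycle",
                "definition": {
                    "actions": {
                        "baseBlob": {
                            "delete": {"daysAfterModificationGreaterThan": days}
                        }
                    },
                    "filters": filters,
                },
            }
        ]
    }


if __name__ == "__main__":
    print(json.dumps(build_delete_policy(days=5, prefix="ingest/"), indent=2))
```

Choosing a value at the upper end of the three-to-five-day range leaves more time for ingestion to complete before the blobs are removed.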
Note
Triggering ingestion following a CopyBlob operation isn't supported for storage accounts that have the hierarchical namespace feature enabled.
Important
We highly discourage generating Storage Events from custom code and sending them to Event Hubs. If you choose to do so, make sure that the events produced strictly adhere to the appropriate Storage Events schema and JSON format specifications.
To remove the Event Grid connection from the Azure portal, follow these steps:
- Go to your cluster. From the left menu, select Databases. Then, select the database that contains the target table.
- From the left menu, select Data connections. Then, select the checkbox next to the relevant Event Grid data connection.
- From the top menu bar, select Delete.