Choose the right integration runtime configuration for your scenario
The integration runtime (IR) is the compute infrastructure that Microsoft Purview uses to power data scans across different network environments. This article introduces the types of integration runtime available in Microsoft Purview and provides guidance on how to choose the right integration runtime configuration for your scenario.
Types of integration runtimes
Microsoft Purview provides the following types of integration runtimes:
- Azure integration runtime: The Azure integration runtime is a fully managed, elastic compute that you can use to scan Azure or non-Azure data sources. The Azure IR supports connections to data stores and compute services with publicly accessible endpoints. It's the default integration runtime, so you don't need to create anything to get started.
- Self-hosted integration runtime: The self-hosted integration runtime can be used to scan data sources in an on-premises network or a virtual network. You can install it on an on-premises machine or a virtual machine inside your private network. Learn more from Create and manage Self-hosted Integration Runtimes.
- Kubernetes supported self-hosted integration runtime (Preview): This integration runtime is hosted on a Kubernetes cluster and can be used to scan data sources in an on-premises network or a virtual network. Kubernetes support improves overall performance and allows the integration runtime to scale with the job. Learn more from Create and manage Kubernetes supported self-hosted integration runtimes.
Choose the right integration runtime
It's important to choose an appropriate type of integration runtime. It must suit your existing architecture and data integration requirements, and it should also accommodate growing business needs and future increases in workload.
The following considerations can help you navigate the decision:
What data source types do you want to scan?
Check the supported data sources section to learn about the supported IR types for the data sources you want to scan.
What’s the network access control on your data source?
Different data sources can have different network firewall settings to protect them from arbitrary access over the internet, whether they are on-premises or cloud / SaaS data stores. The following table lists some common firewall options. You can choose the supported IR type according to your scenario.
| Data source firewall | Supported by Azure IR | Supported by SHIR |
|---|---|---|
| Allow public access | ✓ | ✓ |
| Allow Azure service or trusted service | ✓ | |
| Allow access from specific Azure virtual network | | ✓ |
| Allow specific IP / IP range | | ✓ |
| Other on-premises or private network access | | ✓ |

What’s the firewall setting of your Microsoft Purview?
Microsoft Purview provides different network firewall options. Learn more from Configure Microsoft Purview firewall. You can choose the supported IR type according to your scenario.
| Purview firewall | Supported by Azure IR | Supported by SHIR |
|---|---|---|
| Enabled from all networks | ✓ | ✓ |
| Disabled from all networks | | ✓ (need to create private endpoint from your network) |

What level of security do you require during data transmission?
The integration runtime location defines the location of its back-end compute and where the scan operations are performed. For data residency considerations:
- When you use Azure IR, Microsoft Purview automatically detects the data source's location and uses the IR in that region. If Microsoft Purview can't detect the region, it uses the Purview account's region.
- When you use Managed VNet IR, it runs in the region you configure for the managed virtual network.
- When you use SHIR, you decide the location yourself, because it runs on your on-premises machines or Azure virtual machines.
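The three location rules above can be sketched as a small function. This is only an illustration: the `IRType` enum, the `scan_compute_location` function, and all parameter names are hypothetical and aren't part of any Microsoft Purview SDK.

```python
from enum import Enum
from typing import Optional


class IRType(Enum):
    """Hypothetical labels for the IR types named in the list above."""
    AZURE = "Azure IR"
    MANAGED_VNET = "Managed VNet IR"
    SELF_HOSTED = "SHIR"


def scan_compute_location(
    ir_type: IRType,
    purview_account_region: str,
    detected_source_region: Optional[str] = None,
    managed_vnet_region: Optional[str] = None,
    shir_machine_location: Optional[str] = None,
) -> str:
    """Return where the scan compute runs, following the rules listed above."""
    if ir_type is IRType.AZURE:
        # Use the detected data source region; if detection failed,
        # fall back to the Purview account's region.
        return detected_source_region or purview_account_region
    if ir_type is IRType.MANAGED_VNET:
        # Runs in the region configured for the managed virtual network.
        if managed_vnet_region is None:
            raise ValueError("Managed VNet IR requires a configured region")
        return managed_vnet_region
    # SHIR: the location is wherever you installed it.
    if shir_machine_location is None:
        raise ValueError("SHIR requires the location of the host machine")
    return shir_machine_location
```

For example, calling the function with `IRType.AZURE` and no detected region returns the Purview account's region, which mirrors the fallback described in the first bullet.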
To defend against threats such as man-in-the-middle attacks during data transmission, you can use private endpoints and Private Link to secure the connection.
- You can create managed private endpoints to your data stores when using Managed VNet IR. The private endpoints are maintained by the Microsoft Purview service within the managed virtual network.
- You can also create private endpoints in your virtual network and the SHIR can use them to access data stores.
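The two firewall tables earlier in this article can be combined into a simple lookup that narrows down the candidate IR types. The sketch below assumes the tables are read as follows: the "Azure service or trusted service" option is supported by Azure IR only; the virtual network, IP range, and private network options are supported by SHIR only; and a Purview firewall that is disabled from all networks is supported by SHIR only. The dictionary keys and the `candidate_irs` function are hypothetical names chosen for illustration.

```python
# Data source firewall option -> IR types that can reach the source.
# Keys are hypothetical shorthand for the rows of the first firewall table.
DATA_SOURCE_FIREWALL = {
    "public_access": {"Azure IR", "SHIR"},
    "azure_or_trusted_service": {"Azure IR"},
    "specific_azure_virtual_network": {"SHIR"},
    "specific_ip_range": {"SHIR"},
    "on_premises_or_private_network": {"SHIR"},
}

# Purview firewall option -> IR types that can reach Microsoft Purview.
PURVIEW_FIREWALL = {
    "enabled_from_all_networks": {"Azure IR", "SHIR"},
    # SHIR also needs a private endpoint created from your network.
    "disabled_from_all_networks": {"SHIR"},
}


def candidate_irs(data_source_firewall: str, purview_firewall: str) -> list[str]:
    """Return the IR types allowed by both firewall settings, sorted by name."""
    allowed = DATA_SOURCE_FIREWALL[data_source_firewall] & PURVIEW_FIREWALL[purview_firewall]
    return sorted(allowed)
```

An empty result means that, under this reading of the tables, no single IR type satisfies both firewall settings, so one of the settings would need to change.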
What level of maintenance are you able to provide?
Maintaining infrastructure, servers, and equipment is one of the important tasks of the IT department of an enterprise. It usually takes a lot of time and effort.
- When you use Azure IR or Managed VNet IR, you don’t need to worry about maintenance tasks such as updates, patches, and version management. The Microsoft Purview service takes care of all the maintenance effort.
- Because the SHIR is installed on your machines and the Kubernetes supported SHIR is on your Kubernetes clusters, you need to manage the maintenance.
- The SHIR supports auto-update to automatically get the latest version whenever there's an update. Learn more from Self-hosted integration runtime auto-update and expiration.
- Currently, the Kubernetes supported self-hosted integration runtime only supports manual updates.
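The maintenance responsibilities above can be summarized as a small lookup, which can be handy when you compare options in a script or a planning checklist. The `Maintenance` class, the `MAINTENANCE` dictionary, and the `self_maintained` helper are hypothetical names for this sketch, and the update-mode labels are shorthand for the behavior described in the list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Maintenance:
    maintained_by: str  # "Microsoft Purview service" or "you"
    update_mode: str    # "service-managed", "auto-update", or "manual"


# Summary of the list above; the labels are illustrative shorthand.
MAINTENANCE = {
    "Azure IR": Maintenance("Microsoft Purview service", "service-managed"),
    "Managed VNet IR": Maintenance("Microsoft Purview service", "service-managed"),
    "SHIR": Maintenance("you", "auto-update"),
    "Kubernetes supported SHIR": Maintenance("you", "manual"),
}


def self_maintained(ir_types: list[str]) -> list[str]:
    """Return the IR types from the input that you must maintain yourself."""
    return [t for t in ir_types if MAINTENANCE[t].maintained_by == "you"]
```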
Performance and scalability
We recommend that you use the fully managed and autoscaled Azure IR, Managed VNet IR, or the Kubernetes-supported self-hosted integration runtime whenever applicable. Because they're elastic, they can provide better performance and scalability, especially when scanning large-scale data systems.
Supported data sources
The table below shows all the data sources that are supported by Microsoft Purview scan, and the supported integration runtime types.
| Category | Supported data store | Supported by Azure IR/AWS IR | Supported by SHIR |
|---|---|---|---|
| Azure | Multiple sources | ✓ | |
| | Azure Blob Storage | ✓ | ✓ |
| | Azure Cosmos DB (API for NoSQL) | ✓ | ✓ |
| | Azure Data Explorer | ✓ | ✓ |
| | Azure Data Lake Storage Gen1 | ✓ | ✓ |
| | Azure Data Lake Storage Gen2 | ✓ | ✓ |
| | Azure Database for MySQL | ✓ | ✓ |
| | Azure Database for PostgreSQL | ✓ | ✓ |
| | Azure Databricks Hive Metastore | ✓ | |
| | Azure Databricks Unity Catalog | ✓ | |
| | Azure Dedicated SQL pool (formerly SQL DW) | ✓ | ✓ |
| | Azure Files | ✓ | ✓ |
| | Azure SQL Database | ✓ | ✓ |
| | Azure SQL Managed Instance | ✓ | ✓ |
| | Azure Synapse Analytics (Workspace) | ✓ | ✓ |
| | Fabric | ✓ | |
| | Power BI | ✓ | ✓ |