Choose the right integration runtime configuration for your scenario

The integration runtime (IR) is the compute infrastructure that Microsoft Purview uses to power data scans across different network environments. This article introduces the types of integration runtime available in Microsoft Purview and provides guidance on how to choose the right integration runtime configuration for your scenario.

Types of integration runtimes

Microsoft Purview provides the following types of integration runtimes:

  • Azure integration runtime: The Azure integration runtime is a fully managed, elastic compute resource that you can use to scan Azure or non-Azure data sources. The Azure IR supports connections to data stores and compute services with publicly accessible endpoints. It's the default integration runtime, so you don't need to create anything to get started.
  • Self-hosted integration runtime: The self-hosted integration runtime can be used to scan data sources in an on-premises network or a virtual network. You can install it on an on-premises machine or a virtual machine inside your private network. Learn more from Create and manage Self-hosted Integration Runtimes.

Choose the right integration runtime

It's important to choose an appropriate type of integration runtime. Not only must it be suitable for your existing architecture and requirements for data integration, but you also need to consider how to further meet growing business needs and any future increase in workload.

The following considerations can help you navigate the decision:

  1. What data source types do you want to scan?

    Check the supported data sources section to learn about the supported IR types for the data sources you want to scan.

  2. What’s the network access control on your data source?

    Different data sources may have different network firewall settings to protect them from arbitrary access over the internet, whether they are on-premises or cloud / SaaS data stores. The following table lists some common firewall options. You can choose the supported IR type according to your scenario.

    Data source firewall Supported by Azure IR Supported by SHIR
    Allow public access
    Allow Azure service or trusted service
    Allow access from specific Azure virtual network
    Allow specific IP / IP range
    Other on-premises or private network access
  3. What’s the firewall setting of your Microsoft Purview account?

    Microsoft Purview provides different network firewall options. Learn more from Configure Microsoft Purview firewall. You can choose the supported IR type according to your scenario.

    Purview firewall Supported by Azure IR Supported by SHIR
    Enabled from all networks
    Disabled from all networks ✓ (requires creating a private endpoint from your network)
  4. What level of security do you require during data transmission?

    The integration runtime location defines the location of its back-end compute and where the scan operations are performed. For data residency, consider the following:

    • When you use Azure IR, Microsoft Purview automatically detects the data source's location and uses the IR in that region. If Microsoft Purview can't detect the region, it uses the Microsoft Purview account's region.
    • When you use Managed VNet IR, it runs in the region you configure for the managed virtual network.
    • When you use SHIR, you fully control the location, because you install it on your own on-premises machines or Azure virtual machines.

    To defend against threats such as man-in-the-middle attacks during data transmission, you can use private endpoints and Private Link to help secure the data in transit.

    • You can create managed private endpoints to your data stores when using Managed VNet IR. The private endpoints are maintained by the Microsoft Purview service within the managed virtual network.
    • You can also create private endpoints in your virtual network and the SHIR can use them to access data stores.
  5. What level of maintenance are you able to provide?

    Maintaining infrastructure, servers, and equipment is one of the important tasks of the IT department of an enterprise. It usually takes a lot of time and effort.

    • When you use Azure IR or Managed VNet IR, you don’t need to worry about maintenance such as updates, patches, and versions. The Microsoft Purview service takes care of all the maintenance.
    • Because the SHIR is installed on customer machines, you need to take care of the maintenance. The SHIR supports auto-update, which automatically gets the latest version whenever there's an update. Learn more from Self-hosted integration runtime auto-update and expiration.
  6. Performance and scalability

    We recommend that you use the fully managed and autoscaled Azure IR or Managed VNet IR whenever applicable. Their elasticity can provide better performance and scalability, especially when scanning large-scale data systems.
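
The considerations above can be sketched as a small decision helper. This is an illustrative sketch only, not a Microsoft Purview API: the function names choose_ir and resolve_azure_ir_region, their parameters, and the region strings are all hypothetical. It encodes just two rules stated in this article: the Azure IR connects to publicly accessible endpoints while the SHIR reaches on-premises or virtual network sources, and the Azure IR runs in the data source's region when it can be detected, otherwise in the Microsoft Purview account's region. It ignores the firewall tables and the Managed VNet IR, so always confirm the result against the supported data sources table.

```python
from typing import Optional


def choose_ir(source_publicly_accessible: bool) -> str:
    """Pick an IR type from network reachability alone (simplified assumption).

    Hypothetical helper: a real decision also depends on the data source type,
    the Microsoft Purview firewall setting, and maintenance capacity.
    """
    if source_publicly_accessible:
        # The Azure IR is fully managed and supports public endpoints.
        return "Azure IR"
    # Sources in an on-premises or private network need a self-hosted IR.
    return "Self-hosted IR"


def resolve_azure_ir_region(detected_source_region: Optional[str],
                            purview_account_region: str) -> str:
    """Mirror the Azure IR region rule described in this article.

    Use the data source's region when it was detected; otherwise fall back
    to the Microsoft Purview account's region.
    """
    return detected_source_region or purview_account_region


if __name__ == "__main__":
    print(choose_ir(source_publicly_accessible=True))
    print(choose_ir(source_publicly_accessible=False))
    print(resolve_azure_ir_region(None, "westeurope"))
```

Keeping the region rule in its own function makes the fallback explicit: the account region is used only when no source region is available.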

Supported data sources

The following table shows the data sources that Microsoft Purview can scan and the integration runtime types supported for each.

Category Supported data store Supported by Azure IR/AWS IR Supported by SHIR
Azure Multiple sources
Azure Blob Storage
Azure Cosmos DB (API for NoSQL)
Azure Data Explorer
Azure Data Lake Storage Gen1
Azure Data Lake Storage Gen2
Azure Database for MySQL
Azure Database for PostgreSQL
Azure Databricks Hive Metastore
Azure Databricks Unity Catalog
Azure Dedicated SQL pool (formerly SQL DW)
Azure Files
Azure SQL Database
Azure SQL Managed Instance
Azure Synapse Analytics (Workspace)
Fabric
Power BI