Discover and govern Azure SQL Database in Microsoft Purview

This article outlines the process to register an Azure SQL database source in Microsoft Purview. It includes instructions to authenticate and interact with the SQL database.

Supported capabilities

Metadata Extraction Full Scan Incremental Scan Scoped Scan Classification Access Policy Lineage
Yes Yes Yes Yes Yes Yes Yes (preview)

When you're scanning Azure SQL Database, Microsoft Purview supports extracting technical metadata from these sources:

  • Server
  • Database
  • Schemas
  • Tables, including columns
  • Views, including columns (with lineage extraction enabled, as part of scanning)
  • Stored procedures (with lineage extraction enabled)
  • Stored procedure runs (with lineage extraction enabled)

When you're setting up a scan, you can further scope it after providing the database name by selecting tables and views as needed.

Known limitations

  • For stored procedures lineage extraction:
    • Stored Procedures (SP) lineage extraction scan is currently not supported if your logical server in Azure disables public access or doesn't allow Azure services to access it.
    • SP lineage extraction scan is currently not supported if your Microsoft Purview account disables public access.
    • SP lineage extraction scan is scheduled to run every six hours by default. The frequency can't be changed.
    • Lineage is captured only when the stored procedure execution transfers data from one table to another. And it's not supported for temporary tables.
    • Lineage extraction isn't supported for functions or triggers.
    • Note due to the following limitations, currently you could see duplicate assets in the catalog if you have such scenarios.
      • The object names in assets and fully qualified names follow the case used in stored procedure statements, which might not align with the object case in original data source.
      • When SQL views are referenced in stored procedures, they're currently captured as SQL tables.

Note

Lineage is also supported if Azure SQL tables or views are used as a source/sink in Azure Data Factory Copy and Data Flow activities.

Prerequisites

Register the data source

Before you scan, it's important to register the data source in Microsoft Purview:

  1. Open the Microsoft Purview governance portal by:

  2. Navigate to the Data Map.

    Screenshot that shows the area for opening a Microsoft Purview governance portal.

  3. Create the collection hierarchy by going to Collections and then selecting Add a collection. Assign permissions to individual subcollections as required.

    Screenshot that shows selections for assigning access control permissions to the collection hierarchy.

  4. Go to the appropriate collection under Sources, and then select the Register icon to register a new SQL database.

    Screenshot that shows the collection that's used to register the data source.

  5. Select the Azure SQL Database data source, and then select Continue.

  6. For Name, provide a suitable name for the data source. Select relevant names for Azure subscription, Server name, and Select a collection, and then select Apply.

    Screenshot that shows details entered to register a data source.

  7. Confirm that the SQL database appears under the selected collection.

    Screenshot that shows a data source mapped to a collection to initiate scanning.

Update firewall settings

If your database server has a firewall enabled, you need to update the firewall to allow access in one of the following ways:

For more information about the firewall, see the Azure SQL Database firewall documentation.

Allow Azure connections

Enabling Azure connections will allow Microsoft Purview to connect to the server without requiring you to update the firewall itself.

  1. Go to your database account.
  2. On the Overview page, select the server name.
  3. Select Security > Networking.
  4. For Allow Azure services and resources to access this server, select Yes.

Screenshot that shows selections in the Azure portal to allow Azure connections to a server.

For more information about allowing connections from inside Azure, see the how-to guide.

Install a self-hosted integration runtime

You can install a self-hosted integration runtime on a machine to connect with a resource in a private network:

  1. Choose the right integration runtime for your scenario
  2. Create and install your integration runtime:
  3. Check your database server's networking configuration to confirm that a private endpoint is accessible to the machine that contains the self-hosted integration runtime. Add the IP address of the machine if it doesn't already have access.
  4. If your logical server is behind a private endpoint or in a virtual network, you can use an ingestion private endpoint to ensure end-to-end network isolation.

Configure authentication for a scan

To scan your data source, you need to configure an authentication method in Azure SQL Database.

Important

If you're using a self-hosted integration runtime to connect to your resource, system-assigned and user-assigned managed identities won't work. You need to use service principal authentication or SQL authentication.

Microsoft Purview supports the following options:

  • System-assigned managed identity (SAMI) (recommended). This is an identity that's associated directly with your Microsoft Purview account. It allows you to authenticate directly with other Azure resources without needing to manage a go-between user or credential set.

    The SAMI is created when your Microsoft Purview resource is created. It's managed by Azure and uses your Microsoft Purview account's name. The SAMI can't currently be used with a self-hosted integration runtime for Azure SQL.

    For more information, see the managed identity overview.

  • User-assigned managed identity (UAMI) (preview). Similar to a SAMI, a UAMI is a credential resource that allows Microsoft Purview to authenticate against Microsoft Entra ID.

    The UAMI is managed by users in Azure, rather than by Azure itself, which gives you more control over security. The UAMI can't currently be used with a self-hosted integration runtime for Azure SQL.

    For more information, see the guide for user-assigned managed identities.

  • Service principal. A service principal is an application that can be assigned permissions like any other group or user, without being associated directly with a person. Authentication for service principals has an expiration date, so it can be useful for temporary projects.

    For more information, see the service principal documentation.

  • SQL authentication. Connect to the SQL database with a username and password. For more information, see the SQL authentication documentation.

    If you need to create a login, follow this guide to query a SQL database. Use this guide to create a login by using T-SQL.

    Note

    Be sure to select the Azure SQL Database option on the page.

For steps to authenticate with your SQL database, select your chosen method of authentication from the following tabs.

Note

Only the server-level principal login (created by the provisioning process) or members of the loginmanager database role in the master database can create new logins. The Microsoft Purview account should be able to scan the resources about 15 minutes after it gets permissions.

  1. You need a SQL login with at least db_datareader permissions to be able to access the information that Microsoft Purview needs to scan the database. You can follow the instructions in CREATE LOGIN to create a sign-in for Azure SQL Database. Save the username and password for the next steps.

  2. Go to your key vault in the Azure portal.

  3. Select Settings > Secrets, and then select + Generate/Import.

    Screenshot that shows the key vault option to generate a secret.

  4. For Name and Value, use the username and password (respectively) from your SQL database.

  5. Select Create.

  6. If your key vault isn't connected to Microsoft Purview yet, create a new key vault connection.

  7. Create a new credential by using the key to set up your scan.

    Screenshot that shows the key vault option to set up credentials.

    Screenshot that shows the key vault option to create a secret.

Create the scan

  1. Open your Microsoft Purview account and select Open Microsoft Purview governance portal.

  2. Go to Data map > Sources to view the collection hierarchy.

  3. Select the New Scan icon under the SQL database that you registered earlier.

    Screenshot that shows the pane for creating a new scan.

To learn more about stored procedure data lineage in Azure SQL Database, see the Extract lineage (preview) section of this article.

For scanning steps, select your method of authentication from the following tabs.

  1. For Name, provide a name for the scan.

  2. For Database selection method, select Enter manually.

  3. For Database name and Credential, enter the values that you created earlier.

    Screenshot that shows database and credential information for the SQL authentication option to run a scan.

  4. For Select a connection, choose the appropriate collection for the scan.

  5. Select Test connection to validate the connection. After the connection is successful, select Continue.

Scope and run the scan

  1. You can scope your scan to specific database objects by choosing the appropriate items in the list.

    Screenshot that shows options for scoping a scan.

  2. Select a scan rule set. You can use the system default, choose from existing custom rule sets, or create a new rule set inline. Select Continue when you're finished.

    Screenshot that shows options for selecting a scan rule set.

    If you select New scan rule set, a pane opens so that you can enter the source type, the name of the rule set, and a description. Select Continue when you're finished.

    Screenshot that shows information for creating a new scan rule set.

    For Select classification rules, choose the classification rules that you want to include in the scan rule set, and then select Create.

    Screenshot that shows a list of classification rules for a scan rule set.

    The new scan rule set then appears in the list of available rule sets.

    Screenshot that shows the selection of a new scan rule set.

  3. Choose your scan trigger. You can set up a schedule or run the scan once.

  4. Review your scan, and then select Save and run.

View a scan

To check the status of a scan, go to the data source in the collection, and then select View details.

Screenshot that shows the button for viewing details of a scan.

The scan details indicate the progress of the scan in Last run status, along with the number of assets scanned and classified. Last run status is updated to In progress and then Completed after the entire scan has run successfully.

Screenshot that shows a completed status for the last scan run.

Manage a scan

After you run a scan, you can use the run history to manage it:

  1. Under Recent scans, select a scan.

    Screenshot that shows the selection of a recently completed scan.

  2. In the run history, you have options for running the scan again, editing it, or deleting it.

    Screenshot that shows options for running, editing, and deleting a scan.

    If you select Run scan now to rerun the scan, you can then choose either Incremental scan or Full scan.

    Screenshot that shows options for full or incremental scan.

Troubleshoot scanning

If you have problems with scanning, try these tips:

For more information, review Troubleshoot your connections in Microsoft Purview.

Set up policies

The following types of Microsoft Purview policies are supported on this data resource:

  • Self-service policies - policy that allows users to request access to data sources registered to Microsoft Purview.
  • DevOps policies - grants access to database system metadata across multiple sources. They simplify access provisioning for IT operations and security auditing personnel. They only grant access, and don't deny access.

Access policy prerequisites on Azure SQL Database

Region support

All Microsoft Purview regions are supported.

Configure the Azure SQL Database instance for policies from Microsoft Purview

For the logical server associated with Azure SQL Database to honor policies from Microsoft Purview, you need to configure a Microsoft Entra admin. In the Azure portal, go to the logical server that hosts the Azure SQL Database instance. On the side menu, select Microsoft Entra ID. Set an admin name to any Microsoft Entra user or group that you prefer, and then select Save.

Then, on the side menu, select Identity. Under System assigned managed identity, turn the status to On, and then select Save.

Screenshot that shows the assignment of a system-assigned managed identity to a logical server associated with Azure SQL Database.

Configure the Microsoft Purview account for policies

Register the data source in Microsoft Purview

Before a policy can be created in Microsoft Purview for a data resource, you must register that data resource in Microsoft Purview Studio. You will find the instructions related to registering the data resource later in this guide.

Note

Azure Purview policies rely on the data resource ARM path. If a data resource is moved to a new resource group or subscription it will need to be de-registered and then registered again in Microsoft Purview.

Configure permissions to enable Data policy enforcement on the data source

Once a resource is registered, but before a policy can be created in Microsoft Purview for that resource, you must configure permissions. A set of permissions are needed to enable the Data policy enforcement. This applies to data sources, resource groups, or subscriptions. To enable Data policy enforcement, you must have both specific Identity and Access Management (IAM) privileges on the resource as well as specific Microsoft Purview privileges:

  • You must have either one of the following IAM role combinations on the resource's Azure Resource Manager path or any parent of it (that is, using IAM permission inheritance):

    • IAM Owner
    • Both IAM Contributor and IAM User Access Administrator

    To configure Azure role-based access control (RBAC) permissions, follow this guide. The following screenshot shows how to access the Access Control section in the Azure portal for the data resource to add a role assignment.

    Screenshot that shows the section in the Azure portal for adding a role assignment.

    Note

    The IAM Owner role for a data resource can be inherited from a parent resource group, a subscription, or a subscription management group. Check which Microsoft Entra users, groups, and service principals hold or are inheriting the IAM Owner role for the resource.

  • You also need to have the Microsoft Purview Data source admin role for the collection or a parent collection (if inheritance is enabled). For more information, see the guide on managing Microsoft Purview role assignments.

    The following screenshot shows how to assign the Data source admin role at the root collection level.

    Screenshot that shows selections for assigning the Data source admin role at the root collection level.

Configure Microsoft Purview permissions to create, update, or delete access policies

To create, update or delete policies, you need to get the Policy author role in Microsoft Purview at root collection level:

  • The Policy author role can create, update, and delete DevOps and Data Owner policies.
  • The Policy author role can delete self-service access policies.

For more information about managing Microsoft Purview role assignments, see Create and manage collections in the Microsoft Purview Data Map.

Note

Policy author role must be configured at the root collection level.

In addition, to easily search Microsoft Entra users or groups when creating or updating the subject of a policy, you can greatly benefit from getting the Directory Readers permission in Microsoft Entra ID. This is a common permission for users in an Azure tenant. Without the Directory Reader permission, the Policy Author will have to type the complete username or email for all the principals included in the subject of a data policy.

Delegate access provisioning responsibility to roles in Microsoft Purview

After a resource has been enabled for Data policy enforcement, any Microsoft Purview user with the Policy author role at the root collection level can provision access to that data source from Microsoft Purview.

Note

Any Microsoft Purview root Collection admin can assign new users to root Policy author roles. Any Collection admin can assign new users to a Data source admin role under the collection. Minimize and carefully vet the users who hold Microsoft Purview Collection admin, Data source admin, or Policy author roles.

If a Microsoft Purview account with published policies is deleted, such policies will stop being enforced within an amount of time that depends on the specific data source. This change can have implications on both security and data access availability. The Contributor and Owner roles in IAM can delete Microsoft Purview accounts. You can check these permissions by going to the Access control (IAM) section for your Microsoft Purview account and selecting Role Assignments. You can also use a lock to prevent the Microsoft Purview account from being deleted through Resource Manager locks.

Register the data source and enable Data policy enforcement

The Azure SQL Database resource needs to be registered with Microsoft Purview before you can create access policies. To register your resources, follow the "Prerequisites" and "Register the data source" sections in Enable Data policy enforcement on your Microsoft Purview sources.

After you register the data source, you need to enable Data policy enforcement. This is a prerequisite before you can create policies on the data source. Data policy enforcement can affect the security of your data, because it delegates to certain Microsoft Purview roles that manage access to the data sources. Go through the security practices in Enable Data policy enforcement on your Microsoft Purview sources.

After your data source has the Data policy enforcement option set to Enabled, it will look like this screenshot:

Screenshot that shows the panel for registering a data source for a policy, including areas for name, server name, and data policy enforcement.

Return to the Azure portal for Azure SQL Database to verify it's now governed by Microsoft Purview:

  1. Sign in to the Azure portal through this link

  2. Select the Azure SQL Server that you want to configure.

  3. Go to Microsoft Entra ID on the left pane.

  4. Scroll down to Microsoft Purview access policies.

  5. Select the button to Check for Microsoft Purview Governance. Wait while the request is processed. It could take a few minutes.

    Screenshot that shows Azure SQL is governed by Microsoft Purview.

  6. Confirm that the Microsoft Purview Governance Status shows Governed. Note that it could take a few minutes after you enable Data policy enforcement in Microsoft Purview for the correct status to be reflected.

Note

If you disable Data policy enforcement for this Azure SQL Database data source, it might take up to 24 hours for the Microsoft Purview Governance Status to be updated automatically to Not Governed. This can be accelerated by selecting Check for Microsoft Purview Governance. Before you enable Data policy enforcement for the data source in another Microsoft Purview account, ensure that the Purview Governance Status shows as Not Governed. Then repeat the steps above with the new Microsoft Purview account.

Create an access policy

To create an access policy for Azure SQL Database, follow these guides:

To create policies that cover all data sources inside a resource group or Azure subscription, see Discover and govern multiple Azure sources in Microsoft Purview.

Extract lineage (preview)

Microsoft Purview supports lineage for views and stored procedures from Azure SQL Database. While lineage for views is supported as part of scanning, you will need to turn on the Lineage extraction toggle to extract stored procedure lineage when you're setting up a scan.

Note

Lineage is not currently supported using a self-hosted integration runtime or managed VNET runtime and a Azure SQL private endpoint. You need to enable Azure services to access the server under network settings for your Azure SQL Database, and your Microsoft Purview account needs to allow public access. Learn more about the known limitations in lineage extraction scan.

Lineage for SQL DB views

Starting 6/30/24, SQL DB metadata scan will include lineage extraction for views. Only new scans will include the view lineage extraction. Lineage is extracted at all scan levels (L1/L2/L3). In case of an incremental scan, whatever metadata is scanned as part of incremental scan, the corresponding static lineage for tables/views will be extracted.

Screenshot that shows lineage details for SQL DB views.

Prerequisites for setting up a scan with SP lineage extraction

  1. Follow the steps in the Configure authentication for a scan section of this article to authorize Microsoft Purview to scan your SQL database.

  2. Sign in to Azure SQL Database with your Microsoft Entra account, and assign db_owner permissions to the Microsoft Purview managed identity.

    Note

    The 'db_owner' permissions is needed because lineage is based on XEvent sessions. So Microsoft Purview needs the permission to manage the XEvent sessions in SQL.

    Use the following example SQL syntax to create a user and grant permission. Replace <purview-account> with your account name.

    Create user <purview-account> FROM EXTERNAL PROVIDER
    GO
    EXEC sp_addrolemember 'db_owner', <purview-account> 
    GO
    
  3. Run the following command on your SQL database to create a master key:

    Create master key
    Go
    
  4. Ensure that Allow Azure services and resources to access this server is enabled under networking/firewall for your Azure SQL resource.

Create a scan with lineage extraction turned on

  1. On the pane for setting up a scan, turn on the Enable lineage extraction toggle.

    Screenshot that shows the pane for creating a new scan, with lineage extraction turned on.

  2. Select your method of authentication by following the steps in the Create the scan section of this article.

  3. After you successfully set up the scan, a new scan type called Lineage extraction will run incremental scans every six hours to extract lineage from Azure SQL Database. Lineage is extracted based on the stored procedure runs in the SQL database.

    Screenshot that shows the screen that runs lineage extraction every six hours.

Search Azure SQL Database assets and view runtime lineage

You can browse through the data catalog or search the data catalog to view asset details for Azure SQL Database. The following steps describe how to view runtime lineage details:

  1. Go to the Lineage tab for the asset. When applicable, the asset lineage appears here.

    Screenshot that shows lineage details from stored procedures.

    When applicable, you can further drill down to see the lineage at SQL statement level within a stored procedure, along with column level lineage. When using Self-hosted Integration Runtime for scan, retrieving the lineage drilldown information during scan is supported since version 5.25.8374.1.

    Screenshot that shows stored procedure lineage drilldown.

    For information about supported Azure SQL Database lineage scenarios, refer to the Supported capabilities section of this article. For more information about lineage in general, see Data lineage in Microsoft Purview and Microsoft Purview Data Catalog lineage user guide.

  2. Go to the stored procedure asset. On the Properties tab, go to Related assets to get the latest run details of stored procedures.

    Screenshot that shows run details for stored procedure properties.

  3. Select the stored procedure hyperlink next to Runs to see the Azure SQL Stored Procedure Run overview. Go to the Properties tab to see enhanced runtime information from the stored procedure, such as executedTime, rowCount, and Client Connection.

    Screenshot that shows run properties for a stored procedure.

Troubleshoot lineage extraction for Stored Procedures

The following tips can help you solve problems related to lineage:

  • If no lineage is captured after a successful Lineage extraction run, it's possible that no stored procedures have run at least once since you set up the scan.
  • Lineage is captured for stored procedure runs that happen after a successful scan is set up. Lineage from past stored procedure runs isn't captured.
  • If your database is processing heavy workloads with lots of stored procedure runs, lineage extraction will filter only the most recent runs. Stored procedure runs early in the six-hour window, or the run instances that create heavy query load, won't be extracted. Contact support if you're missing lineage from any stored procedure runs.
  • If a stored procedure contains drop or create statements, they aren't currently captured in lineage

Next steps

To learn more about Microsoft Purview and your data, use these guides: