Data Discovery & Classification

APPLIES TO: Azure SQL Database, Azure SQL Managed Instance, Azure Synapse Analytics (SQL DW)

Data Discovery & Classification is built into Azure SQL Database, Azure SQL Managed Instance, and Azure Synapse Analytics. It provides advanced capabilities for discovering, classifying, labeling, and reporting the sensitive data in your databases.

Your most sensitive data might include business, financial, healthcare, or personal information. Discovering and classifying this data can play a pivotal role in your organization's information-protection approach. It can serve as infrastructure for:

  • Helping to meet standards for data privacy and requirements for regulatory compliance.
  • Various security scenarios, such as monitoring (auditing) and alerting on anomalous access to sensitive data.
  • Controlling access to and hardening the security of databases that contain highly sensitive data.

Data Discovery & Classification is part of the Advanced Data Security offering, which is a unified package for advanced Azure SQL security capabilities. You can access and manage Data Discovery & Classification via the central SQL Advanced Data Security section of the Azure portal.

Note

For information about SQL Server on-premises, see SQL Data Discovery & Classification.

What is Data Discovery & Classification?

Data Discovery & Classification introduces a set of advanced services and new capabilities in Azure. It forms a new information-protection paradigm for SQL Database, SQL Managed Instance, and Azure Synapse, aimed at protecting the data and not just the database. The paradigm includes:

  • Discovery and recommendations: The classification engine scans your database and identifies columns that contain potentially sensitive data. It then provides you with an easy way to review and apply the recommended classifications via the Azure portal.

  • Labeling: You can apply sensitivity-classification labels persistently to columns by using new metadata attributes that have been added to the SQL Server database engine. This metadata can then be used for advanced, sensitivity-based auditing and protection scenarios.

  • Query result-set sensitivity: The sensitivity of a query result set is calculated in real time for auditing purposes.

  • Visibility: You can view the database-classification state in a detailed dashboard in the Azure portal. You can also download a report in Excel format to use for compliance, auditing, and other needs.

Discover, classify, and label sensitive columns

This section describes the steps for:

  • Discovering, classifying, and labeling columns that contain sensitive data in your database.
  • Viewing the current classification state of your database and exporting reports.

The classification includes two metadata attributes:

  • Labels: The main classification attributes, used to define the sensitivity level of the data stored in the column.
  • Information types: Attributes that provide more granular information about the type of data stored in the column.

Define and customize your classification taxonomy

Data Discovery & Classification comes with a built-in set of sensitivity labels and a built-in set of information types and discovery logic. You can now customize this taxonomy and define a set and ranking of classification constructs specifically for your environment.

You define and customize your classification taxonomy in one central place for your entire Azure organization. That location is in Azure Security Center, as part of your security policy. Only someone with administrative rights on the organization's root management group can do this task.

As part of policy management for information protection, you can define custom labels, rank them, and associate them with a selected set of information types. You can also add your own custom information types and configure them with string patterns. The patterns are added to the discovery logic for identifying this type of data in your databases.
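To illustrate how string patterns could feed discovery, here is a minimal Python sketch. It is not the service's actual discovery logic. It assumes that patterns use `%` as a wildcard (as in SQL `LIKE`) and that they are matched against column names; the information type names, patterns, and function names are hypothetical.

```python
import re

# Hypothetical custom information types mapped to LIKE-style patterns.
CUSTOM_INFO_TYPES = {
    "Employee ID": ["%employee%id%", "emp_no"],
    "Loyalty Card": ["%loyalty%card%"],
}


def like_to_regex(pattern):
    """Translate a LIKE-style pattern ('%' = any run of characters) to a regex."""
    parts = (re.escape(part) for part in pattern.split("%"))
    return re.compile("^" + ".*".join(parts) + "$", re.IGNORECASE)


def recommend(column_names, info_types=CUSTOM_INFO_TYPES):
    """Return {column name: information type} using the first matching pattern."""
    compiled = [
        (type_name, like_to_regex(pattern))
        for type_name, patterns in info_types.items()
        for pattern in patterns
    ]
    recommendations = {}
    for column in column_names:
        for type_name, regex in compiled:
            if regex.match(column):
                recommendations[column] = type_name
                break
    return recommendations
```

Calling `recommend(["EmployeeId", "Price"])` would flag only `EmployeeId`, because `Price` matches none of the assumed patterns.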

After the organization-wide policy has been defined, you can continue classifying individual databases by using your customized policy.

Classify your database

Note

The following example uses Azure SQL Database, but you should select the product for which you want to configure Data Discovery & Classification.

  1. Go to the Azure portal.

  2. Go to Advanced Data Security under the Security heading in your Azure SQL Database pane. Select Advanced Data Security, and then select the Data Discovery & Classification card.

    Screenshot: Advanced Data Security pane in the Azure portal

  3. On the Data Discovery & Classification page, the Overview tab includes a summary of the current classification state of the database. The summary includes a detailed list of all classified columns, which you can also filter to show only specific schema parts, information types, and labels. If you haven't classified any columns yet, skip to step 5.

    Screenshot: Summary of the current classification state

  4. To download a report in Excel format, select Export in the top menu of the pane.

  5. To begin classifying your data, select the Classification tab on the Data Discovery & Classification page.

    The classification engine scans your database for columns containing potentially sensitive data and provides a list of recommended column classifications.

  6. View and apply classification recommendations:

    • To view the list of recommended column classifications, select the recommendations panel at the bottom of the pane.

    • To accept a recommendation for a specific column, select the check box in the left column of the relevant row. To mark all recommendations as accepted, select the leftmost check box in the recommendations table header.

      Screenshot: Reviewing and selecting from the list of classification recommendations

    • To apply the selected recommendations, select Accept selected recommendations.

  7. You can also classify columns manually, as an alternative or in addition to the recommendation-based classification:

    1. Select Add classification in the top menu of the pane.

    2. In the context window that opens, select the schema, table, and column that you want to classify, and then select the information type and sensitivity label.

    3. Select Add classification at the bottom of the context window.

      Screenshot: Selecting a column to classify

  8. To complete your classification and persistently label (tag) the database columns with the new classification metadata, select Save in the top menu of the window.

Audit access to sensitive data

An important aspect of the information-protection paradigm is the ability to monitor access to sensitive data. Azure SQL Auditing has been enhanced to include a new field in the audit log called data_sensitivity_information. This field logs the sensitivity classifications (labels) of the data that was returned by a query. Here's an example:

Screenshot: Audit log
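When processing audit records programmatically, it can help to pull the labels out of this field. The Python sketch below assumes the field holds an XML fragment with one `sensitivity_attribute` element per classification, each carrying `label` and `information_type` attributes. That shape, the sample value, and the function name are assumptions for illustration; check the actual contents of your audit log before relying on them.

```python
import xml.etree.ElementTree as ET

# Assumed shape of a data_sensitivity_information value; the real field may differ.
SAMPLE = (
    "<sensitivity_attributes>"
    '<sensitivity_attribute label="Confidential" information_type="Financial"/>'
    '<sensitivity_attribute label="Highly Confidential" information_type="Credentials"/>'
    "</sensitivity_attributes>"
)


def labels_in_audit_field(value):
    """Return the sorted, de-duplicated sensitivity labels found in the field."""
    if not value:
        # Queries that return no classified data are assumed to leave the field empty.
        return []
    root = ET.fromstring(value)
    labels = {element.get("label") for element in root.iter("sensitivity_attribute")}
    labels.discard(None)  # ignore elements without a label attribute
    return sorted(labels)
```

A filter built on this function could, for example, select only the audit records whose result set included data with a particular label.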

Permissions

These built-in roles can read the data classification of a database:

  • Owner
  • Reader
  • Contributor
  • SQL Security Manager
  • User Access Administrator

These built-in roles can modify the data classification of a database:

  • Owner
  • Contributor
  • SQL Security Manager

Learn more about role-based permissions in Azure RBAC.
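The two role lists above can be captured as a small lookup, for example in a script that checks role assignments before attempting a change. This Python sketch only encodes the lists from this section; it does not query Azure RBAC, and the function name is hypothetical.

```python
# Built-in roles that can read or modify data classifications, per the lists above.
READ_ROLES = frozenset({
    "Owner",
    "Reader",
    "Contributor",
    "SQL Security Manager",
    "User Access Administrator",
})
MODIFY_ROLES = frozenset({"Owner", "Contributor", "SQL Security Manager"})


def allowed_actions(roles):
    """Return the classification actions permitted by any of the given roles."""
    assigned = set(roles)
    actions = set()
    if assigned & READ_ROLES:
        actions.add("read")
    if assigned & MODIFY_ROLES:
        actions.add("modify")
    return actions
```

Note that every role that can modify classifications can also read them, so the result is never `{"modify"}` alone.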

Manage classifications

You can use T-SQL, a REST API, or PowerShell to manage classifications.

Use T-SQL

You can use T-SQL to add or remove column classifications, and to retrieve all classifications for the entire database.

Note

When you use T-SQL to manage labels, there's no validation that the labels you add to a column exist in the organization's information-protection policy (the set of labels that appear in the portal recommendations). It's up to you to validate this.
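Because T-SQL performs no policy validation, one option is to validate labels in whatever script generates the statements. The Python sketch below builds an `ADD SENSITIVITY CLASSIFICATION` statement only when the label is in a known set. The label set, helper names, and the choice to bracket-quote identifiers are assumptions for illustration; substitute the labels defined in your own policy.

```python
# Hypothetical label set; replace with the labels from your organization's policy.
POLICY_LABELS = frozenset({"Public", "General", "Confidential", "Highly Confidential"})


def quote_ident(name):
    """Bracket-quote a T-SQL identifier, escaping any closing brackets."""
    return "[" + name.replace("]", "]]") + "]"


def quote_literal(value):
    """Single-quote a T-SQL string literal, escaping embedded quotes."""
    return "'" + value.replace("'", "''") + "'"


def add_classification_sql(schema, table, column, label, information_type,
                           policy_labels=POLICY_LABELS):
    """Build an ADD SENSITIVITY CLASSIFICATION statement for one column.

    Raises ValueError if the label is not in the supplied policy label set.
    """
    if label not in policy_labels:
        raise ValueError(f"Label {label!r} is not in the policy label set")
    target = ".".join(quote_ident(part) for part in (schema, table, column))
    return (
        f"ADD SENSITIVITY CLASSIFICATION TO {target} "
        f"WITH (LABEL = {quote_literal(label)}, "
        f"INFORMATION_TYPE = {quote_literal(information_type)});"
    )
```

The returned string can then be executed with any SQL client. Existing classifications can be listed by querying the `sys.sensitivity_classifications` catalog view.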

For information about using T-SQL for classifications, see the following references:

Use PowerShell cmdlets

You can use PowerShell to manage classifications and recommendations for Azure SQL Database and Azure SQL Managed Instance.

PowerShell cmdlets for Azure SQL Database

PowerShell cmdlets for Azure SQL Managed Instance

Use the REST API

You can use the REST API to programmatically manage classifications and recommendations. The published REST API supports the following operations:

Next steps