Configuring and installing the Azure Information Protection (AIP) unified labeling scanner

Applies to: Azure Information Protection, Windows Server 2019, Windows Server 2016, Windows Server 2012 R2

Relevant for: AIP unified labeling client only. For the classic scanner, see Configuring and installing the Azure Information Protection classic scanner.

This article describes how to configure and install the Azure Information Protection unified labeling, on-premises scanner.

Overview

Before you start, verify that your system complies with the required prerequisites.

When you're ready, continue with the following steps:

  1. Configure the scanner in the Azure portal

  2. Install the scanner

  3. Get an Azure AD token for the scanner

  4. Configure the scanner to apply classification and protection

Then, perform the following configuration procedures as needed for your system:

Procedure: Description
Change which file types to protect: You may want to scan, classify, or protect different file types than the default. For more information, see AIP scanning process.
Upgrade your scanner: Upgrade your scanner to take advantage of the latest features and improvements.
Edit data repository settings in bulk: Use the import and export options to make changes in bulk for multiple data repositories.
Use the scanner with alternative configurations: Use the scanner without configuring labels with any conditions.
Optimize performance: Guidance to optimize your scanner performance.

For more information, see also Supported PowerShell cmdlets.

Configure the scanner in the Azure portal

Note

Azure Information Protection is not currently supported on the Azure China portal. You can achieve the same functionality using the Azure Information Protection PowerShell commands.

Before you install the scanner, or upgrade it from an older general availability version, configure or verify your scanner settings in the Azure Information Protection area of the Azure portal.

To configure your scanner:

  1. Sign in to the Azure portal with one of the following roles:

    • Compliance administrator
    • Compliance data administrator
    • Security administrator
    • Global administrator

    Then, navigate to the Azure Information Protection pane.

    For example, in the search box for resources, services, and docs, start typing Information and select Azure Information Protection.

  2. Create a scanner cluster. This cluster defines your scanner and is used to identify the scanner instance, such as during installation, upgrades, and other processes.

  3. (Optional) Scan your network for risky repositories. Create a network scan job to scan a specified IP address or range, and provide a list of risky repositories that may contain sensitive content you'll want to secure.

    Run your network scan job and then analyze any risky repositories found.

  4. Create a content scan job to define the repositories you want to scan.

Create a scanner cluster

  1. From the Scanner menu on the left, select Clusters.

  2. On the Azure Information Protection - Clusters pane, select Add.

  3. On the Add a new cluster pane, enter a meaningful name for the scanner, and an optional description.

    The cluster name is used to identify the scanner's configurations and repositories. For example, you might enter Europe to identify the geographical locations of the data repositories you want to scan.

    You'll use this name later on to identify where you want to install or upgrade your scanner.

  4. Select Save to save your changes.

Create a network scan job (public preview)

Starting in version 2.8.85.0, you can scan your network for risky repositories. Add one or more of the repositories found to a content scan job to scan them for sensitive content.

Note

The Azure Information Protection network discovery feature is currently in PREVIEW. The Azure Preview Supplemental Terms include additional legal terms that apply to Azure features that are in beta, preview, or otherwise not yet released into general availability.

The following table describes prerequisites required for the network discovery service:

Prerequisite: Description
Install the Network Discovery service: If you've recently upgraded your scanner, you may still need to install the Network Discovery service.

    Run the Install-MIPNetworkDiscovery cmdlet to enable network scan jobs.
Azure Information Protection analytics: Make sure that you have Azure Information Protection analytics enabled.

    In the Azure portal, go to Azure Information Protection > Manage > Configure analytics (Preview).

    For more information, see Central reporting for Azure Information Protection (public preview).

To create a network scan job

  1. Log in to the Azure portal, and go to Azure Information Protection. Under the Scanner menu on the left, select Network scan jobs (Preview).

  2. On the Azure Information Protection - Network scan jobs pane, select Add.

  3. On the Add a new network scan job page, define the following settings:

    Setting: Description
    Network scan job name: Enter a meaningful name for this job. This field is required.
    Description: Enter a meaningful description.
    Select the cluster: From the dropdown, select the cluster you want to use to scan the configured network locations.

      Tip: When selecting a cluster, make sure that the nodes in the cluster you assign can access the configured IP ranges via SMB.
    Configure IP ranges to discover: Click to define an IP address or range.

      In the Choose IP ranges pane, enter an optional name, and then a start IP address and end IP address for your range.

      Tip: To scan a specific IP address only, enter the identical IP address in both the Start IP and End IP fields.
    Set schedule: Define how often you want this network scan job to run.

      If you select Weekly, the Run network scan job on setting appears. Select the days of the week on which you want the network scan job to run.
    Set start time (UTC): Define the date and time that you want this network scan job to start running. If you've selected to run the job daily, weekly, or monthly, the job will run at the defined time, at the recurrence you've selected.

      Note: Be careful when setting the date to any days at the end of the month. If you select 31, the network scan job will not run in any month that has 30 days or fewer.

  4. Select Save to save your changes.
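
The end-of-month caution in the Set start time (UTC) setting can be checked with a short sketch. The function name Get-SkippedMonths is hypothetical and is not part of any AIP module; it only uses the .NET DateTime.DaysInMonth method to list the months of a given year that are shorter than the chosen day of the month, which are the months in which a monthly job set to that day would not run.

```powershell
# Hypothetical helper, not an AIP cmdlet: lists the months of a year that
# are shorter than the chosen day-of-month for a monthly schedule.
function Get-SkippedMonths {
    param(
        [Parameter(Mandatory)] [ValidateRange(1, 31)] [int] $DayOfMonth,
        [Parameter(Mandatory)] [int] $Year
    )
    # Keep only month numbers whose length is below the chosen day
    1..12 | Where-Object { [DateTime]::DaysInMonth($Year, $_) -lt $DayOfMonth }
}

# Months of 2021 in which a job scheduled for day 31 would be skipped
Get-SkippedMonths -DayOfMonth 31 -Year 2021
```

Choosing a day of 28 or lower returns no months for any year, which is one way to keep a monthly job from being skipped.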

Tip

If you want to run the same network scan using a different scanner, change the cluster defined in the network scan job.

Return to the Network scan jobs pane, and select Assign to cluster to select a different cluster now, or Unassign cluster to make additional changes later.

Analyze risky repositories found (public preview)

Repositories found, either by a network scan job, a content scan job, or by user access detected in log files, are aggregated and listed on the Scanner > Repositories pane.

If you've defined a network scan job and have set it to run at a specific date and time, wait until it's finished running to check for results. You can also return here after running a content scan job to view updated data.

Note

The Azure Information Protection Repositories feature is currently in PREVIEW. The Azure Preview Supplemental Terms include additional legal terms that apply to Azure features that are in beta, preview, or otherwise not yet released into general availability.

  1. Under the Scanner menu on the left, select Repositories.

    The repositories found are shown as follows:

    • The Repositories by status graph shows how many repositories are already configured for a content scan job, and how many are not.
    • The Top 10 unmanaged repositories by access graph lists the top 10 repositories that are not currently assigned to a content scan job, as well as details about their access levels. Access levels can indicate how risky your repositories are.
    • The table below the graphs lists each repository found and its details.
  2. Do any of the following:

    Option: Description
    Columns: Select Columns to change the table columns displayed.
    Refresh: If your scanner has recently run a network scan and returned results, select Refresh to refresh the page.
    Assign Selected Items: Select one or more repositories listed in the table, and then select Assign Selected Items to assign them to a content scan job.
    Filter: The filter row shows any filtering criteria currently applied. Select any of the criteria shown to modify its settings, or select Add Filter to add new filtering criteria.

      Select Filter to apply your changes and refresh the table with the updated filter.
    Log Analytics: In the top-right corner of the unmanaged repositories graph, click the Log Analytics icon to jump to Log Analytics data for these repositories.

Repositories where Public access is found to have read or read/write capabilities may have sensitive content that must be secured. If Public access is false, the repository is not accessible by the public at all.

Public access to a repository is only reported if you've set a weak account in the StandardDomainsUserAccount parameter of the Install-MIPNetworkDiscovery or Set-MIPNetworkDiscoveryConfiguration cmdlets.

  • The accounts defined in these parameters are used to simulate the access of a weak user to the repository. If the weak user defined there can access the repository, this means that the repository can be accessed publicly.

  • To ensure that public access is reported correctly, make sure that the user specified in these parameters is a member of the Domain Users group only.
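
As a rough sketch of how a weak account might be supplied at install time: the account names below are hypothetical, the SQL Server instance and cluster name reuse the examples from this article, and every parameter other than StandardDomainsUserAccount is an assumption about the cmdlet's syntax that should be confirmed with Get-Help Install-MIPNetworkDiscovery before use.

```powershell
# Sketch only: account names are hypothetical, and all parameters other than
# -StandardDomainsUserAccount are assumptions to confirm with
# Get-Help Install-MIPNetworkDiscovery -Full.
$serviceAccount = Get-Credential -UserName 'CONTOSO\scanner' -Message 'Scanner service account'

# The weak account should be a member of the Domain Users group only
$weakAccount = Get-Credential -UserName 'CONTOSO\weakuser' -Message 'Weak account used to simulate public access'

Install-MIPNetworkDiscovery -SqlServerInstance 'SQLSERVER1\SQLEXPRESS' -Cluster 'Europe' `
    -ServiceUserCredentials $serviceAccount `
    -StandardDomainsUserAccount $weakAccount
```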

Create a content scan job

Deep dive into your content to scan specific repositories for sensitive content.

You may want to do this only after running a network scan job to analyze the repositories in your network, but you can also define your repositories yourself.

To create your content scan job in the Azure portal:

  1. Under the Scanner menu on the left, select Content scan jobs.

  2. On the Azure Information Protection - Content scan jobs pane, select Add.

  3. For this initial configuration, configure the following settings, and then select Save but do not close the pane.

    Setting: Description
    Content scan job settings:
    - Schedule: Keep the default of Manual
    - Info types to be discovered: Change to Policy only
    - Configure repositories: Do not configure at this time because the content scan job must first be saved.
    DLP policy: If you are using a Microsoft 365 Data Loss Prevention (DLP) policy, set Enable DLP rules to On. For more information, see Use a DLP policy.
    Sensitivity policy:
    - Enforce: Select Off
    - Label files based on content: Keep the default of On
    - Default label: Keep the default of Policy default
    - Relabel files: Keep the default of Off
    Configure file settings:
    - Preserve "Date modified", "Last modified" and "Modified by": Keep the default of On
    - File types to scan: Keep the default file types for Exclude
    - Default owner: Keep the default of Scanner Account
    - Set repository owner: Use this option only when using a DLP policy.
  4. Now that the content scan job is created and saved, you're ready to return to the Configure repositories option to specify the data stores to be scanned.

    Specify UNC paths and SharePoint Server URLs for SharePoint on-premises document libraries and folders.

    Note

    SharePoint Server 2019, SharePoint Server 2016, and SharePoint Server 2013 are supported for SharePoint. SharePoint Server 2010 is also supported when you have extended support for this version of SharePoint.

    To add your first data store, while on the Add a new content scan job pane, select Configure repositories to open the Repositories pane.

    1. On the Repositories pane, select Add.

    2. On the Repository pane, specify the path for the data repository, and then select Save.

      • For a network share, use \\Server\Folder.
      • For a SharePoint library, use http://sharepoint.contoso.com/Shared%20Documents/Folder.
      • For a local path: C:\Folder
      • For a UNC path: \\Server\Folder

    Note

    Wildcards and WebDAV locations are not supported.

    If you add a SharePoint path for Shared Documents:

    • Specify Shared Documents in the path when you want to scan all documents and all folders from Shared Documents. For example: http://sp2013/SharedDocuments
    • Specify Documents in the path when you want to scan all documents and all folders from a subfolder under Shared Documents. For example: http://sp2013/Documents/SalesReports
    • Or, specify only the FQDN of your SharePoint, for example http://sp2013, to discover and scan all SharePoint sites and subsites under a specific URL and subtitles under this URL. Grant the scanner Site Collection Auditor rights to enable this.

    For the remaining settings on this pane, do not change them for this initial configuration, but keep them as Content scan job default. The default setting means that the data repository inherits the settings from the content scan job.

    Use the following syntax when adding SharePoint paths:

    Path: Syntax
    Root path: http://<SharePoint server name>

      Scans all sites, including any site collections allowed for the scanner user.
      Requires additional permissions to automatically discover root content.
    Specific SharePoint subsite or collection: One of the following:
    - http://<SharePoint server name>/<subsite name>
    - http://<SharePoint server name>/<site collection name>/<site name>

      Requires additional permissions to automatically discover site collection content.
    Specific SharePoint library: One of the following:
    - http://<SharePoint server name>/<library name>
    - http://<SharePoint server name>/.../<library name>
    Specific SharePoint folder: http://<SharePoint server name>/.../<folder name>

  5. Repeat the previous steps to add as many repositories as needed.

    When you're done, close both the Repositories and Content scan job panes.
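
Before entering many repository paths, it can help to sanity-check them against the formats listed in this procedure. The following is a minimal sketch, not part of the scanner: Get-RepositoryPathKind is a hypothetical function name, it only recognizes the three path shapes shown in the examples (SharePoint URL, UNC path, local path), and it only checks for the * wildcard character.

```powershell
# Hypothetical pre-check, not part of the scanner: classifies a repository
# path and rejects the * wildcard, since wildcards are not supported.
function Get-RepositoryPathKind {
    param([Parameter(Mandatory)] [string] $Path)

    if ($Path.Contains('*')) { throw "Wildcards are not supported: $Path" }

    if ($Path -match '^https?://') { 'SharePoint' }            # e.g. http://sp2013/Documents
    elseif ($Path -match '^\\\\[^\\]+\\.+') { 'UNC' }          # e.g. \\Server\Folder
    elseif ($Path -match '^[A-Za-z]:\\') { 'Local' }           # e.g. C:\Folder
    else { throw "Unrecognized repository path: $Path" }
}

Get-RepositoryPathKind -Path '\\Server\Folder'
Get-RepositoryPathKind -Path 'http://sharepoint.contoso.com/Shared%20Documents/Folder'
Get-RepositoryPathKind -Path 'C:\Folder'
```

The check is purely syntactic; it does not confirm that the path exists or that the scanner account can reach it.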

Back on the Azure Information Protection - Content scan job pane, your content scan name is displayed, with the SCHEDULE column showing Manual and the ENFORCE column blank.

You're now ready to install the scanner with the content scan job that you've created. Continue with Install the scanner.

Install the scanner

After you've configured the Azure Information Protection scanner in the Azure portal, perform the steps below to install the scanner:

  1. Sign in to the Windows Server computer that will run the scanner. Use an account that has local administrator rights and that has permissions to write to the SQL Server master database.

    Important

    You must have the AIP unified labeling client installed on your machine before installing the scanner.

    For more information, see Prerequisites for installing and deploying the Azure Information Protection scanner.

  2. Open a Windows PowerShell session with the Run as an administrator option.

  3. Run the Install-AIPScanner cmdlet, specifying the SQL Server instance on which to create a database for the Azure Information Protection scanner, and the scanner cluster name that you specified in the preceding section:

    Install-AIPScanner -SqlServerInstance <name> -Cluster <cluster name>
    

    Examples, using the scanner cluster name of Europe:

    • For a default instance: Install-AIPScanner -SqlServerInstance SQLSERVER1 -Cluster Europe

    • For a named instance: Install-AIPScanner -SqlServerInstance SQLSERVER1\AIPSCANNER -Cluster Europe

    • For SQL Server Express: Install-AIPScanner -SqlServerInstance SQLSERVER1\SQLEXPRESS -Cluster Europe

    When you are prompted, provide the Active Directory credentials for the scanner service account.

    Use the following syntax: <domain\user name>. For example: contoso\scanneraccount

  4. Verify that the service is now installed by using Administrative Tools > Services.

    The installed service is named Azure Information Protection Scanner and is configured to run by using the scanner service account that you created.
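
The same verification can be sketched in PowerShell instead of the Services console. This assumes the display name matches the name given above exactly; it uses only the built-in Get-Service and Get-CimInstance cmdlets.

```powershell
# Look up the service by the display name given above; -ErrorAction Stop
# turns a missing service into a terminating error.
$service = Get-Service -DisplayName 'Azure Information Protection Scanner' -ErrorAction Stop

# Show the service name and its current state
$service | Select-Object Name, DisplayName, Status

# Show the account the service is configured to run under
Get-CimInstance -ClassName Win32_Service -Filter "Name='$($service.Name)'" |
    Select-Object Name, StartName
```

StartName should show the scanner service account that you supplied when prompted during installation.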

Now that you have installed the scanner, you need to get an Azure AD token for the scanner service account to authenticate, so that the scanner can run unattended.

Get an Azure AD token for the scanner

An Azure AD token allows the scanner to authenticate to the Azure Information Protection service, enabling the scanner to run non-interactively.

For more information, see How to label files non-interactively for Azure Information Protection.

To get an Azure AD token:

  1. Return to the Azure portal to create an Azure AD application to specify an access token for authentication.

  2. From the Windows Server computer, if your scanner service account has been granted the Log on locally right for the installation, sign in with this account and start a PowerShell session.

    Run Set-AIPAuthentication, specifying the values that you copied from the previous step:

    Set-AIPAuthentication -AppId <ID of the registered app> -AppSecret <client secret string> -TenantId <your tenant ID> -DelegatedUser <Azure AD account>
    

    For example:

    $pscreds = Get-Credential CONTOSO\scanner
    Set-AIPAuthentication -AppId "77c3c1c3-abf9-404e-8b2b-4652836c8c66" -AppSecret "OAkk+rnuYc/u+]ah2kNxVbtrDGbS47L4" -DelegatedUser scanner@contoso.com -TenantId "9c11c87a-ac8b-46a3-8d5c-f4d0b72ee29a" -OnBehalfOf $pscreds
    Acquired application access token on behalf of CONTOSO\scanner.
    

Tip

If your scanner service account cannot be granted the Log on locally right for the installation, use the OnBehalfOf parameter with Set-AIPAuthentication, as described in How to label files non-interactively for Azure Information Protection.

The scanner now has a token to authenticate to Azure AD. This token is valid for one year, two years, or never expires, according to your configuration of the Web app / API client secret in Azure AD.

When the token expires, you must repeat this procedure.

You're now ready to run your first scan in discovery mode. For more information, see Run a discovery cycle and view reports for the scanner.

Once you've run your initial discovery scan, continue with Configure the scanner to apply classification and protection.

Configure the scanner to apply classification and protection

The default settings configure the scanner to run once, and in reporting-only mode.

To change these settings, edit the content scan job:

  1. In the Azure portal, on the Azure Information Protection - Content scan jobs pane, select the cluster and content scan job to edit it.

  2. On the Content scan job pane, change the following, and then select Save:

    • From the Content scan job section: Change the Schedule to Always
    • From the Sensitivity policy section: Change Enforce to On

    Tip

    You may want to change other settings on this pane, such as whether file attributes are changed and whether the scanner can relabel files. Use the information popup help to learn more about each configuration setting.

  3. Make a note of the current time and start the scanner again from the Azure Information Protection - Content scan jobs pane.

    Alternatively, run the following command in your PowerShell session:

    Start-AIPScan
    

The scanner is now scheduled to run continuously. When the scanner works its way through all configured files, it automatically starts a new cycle so that any new and changed files are discovered.
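
After starting a scan from PowerShell, you may want to confirm that the scanner is actually running. A minimal sketch, assuming the Get-AIPScannerStatus cmdlet is available in the same module as Start-AIPScan (it is not mentioned elsewhere in this article, so check the supported cmdlets list for your client version):

```powershell
# Start a scan cycle, then query the scanner status.
# Get-AIPScannerStatus is assumed to be available alongside Start-AIPScan.
Start-AIPScan
Get-AIPScannerStatus
```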

Use a DLP policy (public preview)

Using a Microsoft 365 Data Loss Prevention (DLP) policy enables the scanner to detect potential data leaks by matching DLP rules to files stored in file shares and SharePoint Server.

  • Enable DLP rules in your content scan job to reduce the exposure of any files that match your DLP policies. When your DLP rules are enabled, the scanner may reduce file access to data owners only, or reduce exposure to network-wide groups, such as Everyone, Authenticated Users, or Domain Users.

  • In your Microsoft 365 labeling admin center, determine whether you are just testing your DLP policy or whether you want your rules enforced and your file permissions changed according to those rules. For more information, see Turn on a DLP policy.

Tip

Scanning your files, even when just testing the DLP policy, also creates file permission reports. Query these reports to investigate specific file exposures or explore the exposure of a specific user to scanned files.

DLP policies are configured in your labeling admin center, such as the Microsoft 365 Compliance center, and are supported in Azure Information Protection starting in version 2.10.43.0.

For more information about DLP licensing, see Get started with the data loss prevention on-premises scanner.

To use a DLP policy with the scanner:

  1. In the Azure portal, navigate to your content scan job. For more information, see Create a content scan job.

  2. Under DLP policy, set Enable DLP rules to On.

    Important

    Do not set Enable DLP rules to On unless you actually have a DLP policy configured in Microsoft 365.

    Turning this feature on without a DLP policy will cause the scanner to generate errors.

  3. (Optional) Under Configure file settings, set Set repository owner to On, and define a specific user as the repository owner.

    This option enables the scanner to reduce the exposure of any files found in this repository, which match the DLP policy, to the repository owner defined.

DLP policies and make private actions

If you are using a DLP policy with a make private action, and are also planning to use the scanner to automatically label your files, we recommend that you also define the unified labeling client's UseCopyAndPreserveNTFSOwner advanced setting.

This setting ensures that the original owners retain access to their files.

For more information, see Create a content scan job and Apply a sensitivity label to content automatically in the Microsoft 365 documentation.

Change which file types to protect

By default, the AIP scanner protects Office file types and PDF files only.

Use PowerShell commands to change this behavior as needed, such as to configure the scanner to protect all file types, just as the client does, or to protect additional, specific file types.

For a label policy that applies to the user account downloading labels for the scanner, specify a PowerShell advanced setting named PFileSupportedExtensions.

For a scanner that has access to the internet, this user account is the account that you specify for the DelegatedUser parameter with the Set-AIPAuthentication command.

Example 1: PowerShell command for the scanner to protect all file types, where your label policy is named "Scanner":

Set-LabelPolicy -Identity Scanner -AdvancedSettings @{PFileSupportedExtensions="*"}

Example 2: PowerShell command for the scanner to protect .xml files and .tiff files in addition to Office files and PDF files, where your label policy is named "Scanner":

Set-LabelPolicy -Identity Scanner -AdvancedSettings @{PFileSupportedExtensions=ConvertTo-Json(".xml", ".tiff")}

For more information, see Change which file types to protect.
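
To see what value Example 2 passes for PFileSupportedExtensions, you can build the JSON string on its own before calling Set-LabelPolicy. This sketch uses only the built-in ConvertTo-Json cmdlet; the -Compress switch is added here purely to keep the output on one line for inspection and is not used in the example above.

```powershell
# Build the JSON array string used as the PFileSupportedExtensions value.
# -Compress keeps it on one line so it is easy to inspect.
$extensions = ConvertTo-Json -InputObject @('.xml', '.tiff') -Compress
$extensions

# The same hashtable shape as in Example 2, as it would be passed to -AdvancedSettings
$advancedSettings = @{ PFileSupportedExtensions = $extensions }
$advancedSettings.PFileSupportedExtensions
```

Both output lines show the string [".xml",".tiff"], which makes it easy to spot a missing leading dot or a misspelled extension before the policy is changed.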

升级扫描程序Upgrade your scanner

如果你以前安装了扫描程序,现在想要升级,请按照升级 Azure 信息保护扫描程序中的说明操作。If you have previously installed the scanner and want to upgrade, use the instructions described in Upgrading the Azure Information Protection scanner.

然后,像往常一样配置使用扫描程序,但要跳过安装扫描程序的步骤。Then, configure and use your scanner as usual, skipping the steps to install your scanner.

Edit data repository settings in bulk

Use the Export and Import buttons to make changes for your scanner across several repositories.

This way, you don't need to make the same change manually, several times over, in the Azure portal.

For example, if you have a new file type on several SharePoint data repositories, you may want to update the settings for those repositories in bulk.

To make changes in bulk across repositories:

  1. In the Azure portal, on the Repositories pane, select the Export option.

    [Screenshot: Exporting data repository settings for the Azure Information Protection scanner]

  2. Manually edit the exported file to make your change.

  3. Use the Import option on the same page to import the updates back across your repositories.

Use the scanner with alternative configurations

The Azure Information Protection scanner usually looks for conditions specified for your labels in order to classify and protect your content as needed.

In the following scenarios, the Azure Information Protection scanner can also scan your content and manage labels without any conditions configured:

Apply a default label to all files in a data repository

In this configuration, all unlabeled files in the repository are labeled with the default label specified for the repository or the content scan job. Files are labeled without inspection.

Configure the following settings:

Setting | Description
Label files based on content | Set to Off
Default label | Set to Custom, and then select the label to use
Enforce default label | Select to have the default label applied to all files, even if they are already labeled.

Remove existing labels from all files in a data repository

In this configuration, all existing labels are removed, including protection, if protection was applied with the label. Protection applied independently of a label is retained.

Configure the following settings:

Setting | Description
Label files based on content | Set to Off
Default label | Set to None
Relabel files | Set to On, with the Enforce default label checkbox selected

Identify all custom conditions and known sensitive information types

This configuration enables you to find sensitive information that you might not realize you had, at the expense of scanning rates for the scanner.

Set the Info types to be discovered to All.

To identify conditions and information types for labeling, the scanner uses any custom sensitive information types specified, and the list of built-in sensitive information types that are available to select, as defined in your labeling management center.

Optimize scanner performance

Note

If you are looking to improve the responsiveness of the scanner computer rather than the scanner performance, use an advanced client setting to limit the number of threads used by the scanner.
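As a sketch of what the note describes: this article does not name the advanced client setting, so the example below assumes it is the label policy advanced setting called ScannerConcurrencyLevel. The policy name "Scanner" and the value 8 are also illustrative assumptions. Verify the setting name against the current advanced settings documentation for the unified labeling client before use.

```powershell
# Assumption: the label policy applied to the scanner account is named "Scanner".
# Assumption: ScannerConcurrencyLevel is the advanced setting that caps scanner threads.
# The value 8 is an arbitrary illustration, not a recommendation.
Set-LabelPolicy -Identity Scanner -AdvancedSettings @{ScannerConcurrencyLevel="8"}
```

A lower value leaves more processor time for other work on the scanner computer, at the cost of slower scans.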

Use the following options and guidance to help you optimize scanner performance:

Option | Description
Have a high-speed, reliable network connection between the scanner computer and the scanned data store | For example, place the scanner computer in the same LAN or, preferably, in the same network segment as the scanned data store.

The quality of the network connection affects scanner performance because, to inspect the files, the scanner transfers the contents of the files to the computer running the scanner service.

Reducing or eliminating the network hops required for the data to travel also reduces the load on your network.
Make sure the scanner computer has available processor resources | Inspecting the file contents and encrypting and decrypting files are processor-intensive actions.

Monitor the typical scanning cycles for your specified data stores to identify whether a lack of processor resources is negatively affecting scanner performance.
Install multiple instances of the scanner | The Azure Information Protection scanner supports multiple configuration databases on the same SQL Server instance when you specify a custom cluster name for the scanner.

Multiple scanners can also share the same cluster, resulting in quicker scanning times.
Check your alternative configuration usage | The scanner runs more quickly when you use the alternative configuration to apply a default label to all files, because the scanner does not inspect the file contents.

The scanner runs more slowly when you use the alternative configuration to identify all custom conditions and known sensitive information types.

Additional factors that affect performance

Additional factors that affect scanner performance include:

Factor | Description
Load/response times | The current load and response times of the data stores that contain the files to scan also affect scanner performance.
Scanner mode (Discovery / Enforce) | Discovery mode typically has a higher scanning rate than enforce mode.

Discovery mode requires a single file read action, whereas enforce mode requires read and write actions.
Policy changes | Your scanner performance may be affected if you've made changes to the autolabeling in the label policy.

Your first scan cycle, when the scanner must inspect every file, takes longer than subsequent scan cycles that, by default, inspect only new and changed files.

If you change the conditions or autolabeling settings, all files are scanned again. For more information, see Rescanning files.
Regex constructions | Scanner performance is affected by how the regular expressions for your custom conditions are constructed.

To avoid heavy memory consumption and the risk of timeouts (15 minutes per file), review your regular expressions for efficient pattern matching.

For example:
- Avoid greedy quantifiers
- Use non-capturing groups such as (?:expression) instead of (expression)
Log level | Log level options for the scanner reports include Debug, Info, Error, and Off.

- Off results in the best performance.
- Debug considerably slows down the scanner and should be used only for troubleshooting.

For more information, see the ReportLevel parameter for the Set-AIPScannerConfiguration cmdlet.
Files being scanned | - With the exception of Excel files, Office files are scanned more quickly than PDF files.

- Unprotected files are quicker to scan than protected files.

- Large files take longer to scan than small files.
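The regex guidance in the table above can be illustrated with a small, self-contained sketch. The sample text and patterns are hypothetical, and the example runs on the .NET regex engine that PowerShell uses. Whether the scanner evaluates custom conditions in exactly the same way is an assumption, so treat this as an illustration of the two techniques rather than a statement about scanner internals.

```powershell
# Hypothetical sample text; the invoice-number format is made up for illustration.
$text = 'Invoice INV-2024-0042 issued; ref INV-2023-0007 archived;'

# Capturing group: the engine stores the year for every match.
$capturing    = 'INV-(\d{4})-\d{4}'
# Non-capturing group: same matches, but nothing is stored for the group.
$nonCapturing = 'INV-(?:\d{4})-\d{4}'

$m1 = [regex]::Matches($text, $capturing)
$m2 = [regex]::Matches($text, $nonCapturing)

$m1.Count            # 2 matches
$m2.Count            # 2 matches
$m1[0].Groups.Count  # 2 (whole match plus one captured group)
$m2[0].Groups.Count  # 1 (whole match only)

# Greedy '.*' extends to the last ';' in the text; lazy '.*?' stops at the first one.
$greedy = [regex]::Match($text, 'INV.*;').Value
$lazy   = [regex]::Match($text, 'INV.*?;').Value
$greedy  # INV-2024-0042 issued; ref INV-2023-0007 archived;
$lazy    # INV-2024-0042 issued;
```

Both group styles find the same matches. The non-capturing form avoids storing sub-matches that a condition never reads, and the lazy quantifier avoids scanning to the end of the input and backtracking.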

Supported PowerShell cmdlets

This section lists PowerShell cmdlets supported for the Azure Information Protection scanner.

Supported cmdlets for the scanner include:

Next steps

Once you've installed and configured your scanner, start scanning your files.

See also: Deploying the Azure Information Protection scanner to automatically classify and protect files.

More information: