Check text against a custom term list in C#

The default global list of terms in Azure Content Moderator is sufficient for most content moderation needs. However, you might need to screen for terms that are specific to your organization. For example, you might want to tag competitor names for further review.

You can use the Content Moderator SDK for .NET to create custom lists of terms to use with the Text Moderation API.

This article provides information and code samples to help you get started using the Content Moderator SDK for .NET to:

  • Create a list.
  • Add terms to a list.
  • Screen text against the terms in a list.
  • Delete terms from a list.
  • Delete a list.
  • Edit list information.
  • Refresh the index so that changes to the list are included in a new scan.

If you don't have an Azure subscription, create a trial account before you begin.

Sign up for Content Moderator services

Before you can use Content Moderator services through the REST API or the SDK, you'll need a subscription key. Subscribe to the Content Moderator service in the Azure portal to obtain one.

Create your Visual Studio project

  1. Add a new Console app (.NET Framework) project to your solution.

  2. Name the project TermLists. Select this project as the single startup project for the solution.

Install required packages

Install the following NuGet packages for the TermLists project:

  • Microsoft.Azure.CognitiveServices.ContentModerator
  • Microsoft.Rest.ClientRuntime
  • Microsoft.Rest.ClientRuntime.Azure
  • Newtonsoft.Json

Update the program's using statements

Add the following using statements.

using Microsoft.Azure.CognitiveServices.ContentModerator;
using Microsoft.Azure.CognitiveServices.ContentModerator.Models;
using Newtonsoft.Json;
using System;
using System.Collections.Generic;
using System.IO;
using System.Threading;

Create the Content Moderator client

Add the following code to create a Content Moderator client for your subscription. Update the AzureEndpoint and CMSubscriptionKey fields with the values of your endpoint URL and subscription key. You can find these in the Quick start tab of your resource in the Azure portal.

/// <summary>
/// Wraps the creation and configuration of a Content Moderator client.
/// </summary>
/// <remarks>This class library contains insecure code. If you adapt this 
/// code for use in production, use a secure method of storing and using
/// your Content Moderator subscription key.</remarks>
public static class Clients
{
    /// <summary>
    /// The base URL fragment for Content Moderator calls.
    /// </summary>
    private static readonly string AzureEndpoint = "YOUR ENDPOINT URL";

    /// <summary>
    /// Your Content Moderator subscription key.
    /// </summary>
    private static readonly string CMSubscriptionKey = "YOUR API KEY";

    /// <summary>
    /// Returns a new Content Moderator client for your subscription.
    /// </summary>
    /// <returns>The new client.</returns>
    /// <remarks>The <see cref="ContentModeratorClient"/> is disposable.
    /// When you have finished using the client,
    /// you should dispose of it either directly or indirectly. </remarks>
    public static ContentModeratorClient NewClient()
    {
        // Create and initialize an instance of the Content Moderator API wrapper.
        ContentModeratorClient client = new ContentModeratorClient(new ApiKeyServiceClientCredentials(CMSubscriptionKey));

        client.Endpoint = AzureEndpoint;
        return client;
    }
}
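The remarks on the Clients class note that hardcoding the key is insecure. One possible alternative, shown here only as a minimal sketch, is to read both values from environment variables. The class name ClientSettings and the variable names CONTENT_MODERATOR_ENDPOINT and CONTENT_MODERATOR_KEY are hypothetical and not part of the SDK or the original sample.

```csharp
using System;

/// <summary>
/// Hypothetical helper (not part of the SDK) that reads the endpoint and
/// subscription key from environment variables instead of source code.
/// </summary>
public static class ClientSettings
{
    // Assumed variable names; substitute whatever your environment defines.
    private const string EndpointVariable = "CONTENT_MODERATOR_ENDPOINT";
    private const string KeyVariable = "CONTENT_MODERATOR_KEY";

    /// <summary>Returns the endpoint URL, or throws if it is not set.</summary>
    public static string GetEndpoint()
    {
        return ReadRequired(EndpointVariable);
    }

    /// <summary>Returns the subscription key, or throws if it is not set.</summary>
    public static string GetSubscriptionKey()
    {
        return ReadRequired(KeyVariable);
    }

    private static string ReadRequired(string name)
    {
        string value = Environment.GetEnvironmentVariable(name);
        if (string.IsNullOrWhiteSpace(value))
        {
            // Fail early with a clear message rather than sending an empty key.
            throw new InvalidOperationException(
                string.Format("Environment variable '{0}' is not set.", name));
        }
        return value;
    }
}
```

With this in place, the AzureEndpoint and CMSubscriptionKey fields could be initialized from ClientSettings.GetEndpoint() and ClientSettings.GetSubscriptionKey() rather than from string literals. A dedicated secret store is generally a stronger option for production use.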

Add private constants

Add the following private constants to the Program class in the TermLists namespace.

/// <summary>
/// The language of the terms in the term lists.
/// </summary>
private const string lang = "eng";

/// <summary>
/// The minimum amount of time, in milliseconds, to wait between calls
/// to the Content Moderator APIs.
/// </summary>
private const int throttleRate = 3000;

/// <summary>
/// The number of minutes to delay after updating the search index before
/// performing text screening operations against the list.
/// </summary>
private const double latencyDelay = 0.5;

Create a term list

You create a term list with ContentModeratorClient.ListManagementTermLists.Create. The first parameter to Create is a string that contains a MIME type, which should be "application/json". For more information, see the API reference. The second parameter is a Body object that contains a name and description for the new term list.

Note

You can have a maximum of 5 term lists, and each list can contain at most 10,000 terms.

Add the following method definition to the Program class in the TermLists namespace.

Note

Your Content Moderator service key has a requests-per-second (RPS) rate limit, and if you exceed the limit, the SDK throws an exception with a 429 error code. A free tier key has a one-RPS rate limit.

/// <summary>
/// Creates a new term list.
/// </summary>
/// <param name="client">The Content Moderator client.</param>
/// <returns>The term list ID.</returns>
static string CreateTermList (ContentModeratorClient client)
{
    Console.WriteLine("Creating term list.");

    Body body = new Body("Term list name", "Term list description");
    TermList list = client.ListManagementTermLists.Create("application/json", body);
    if (false == list.Id.HasValue)
    {
        throw new Exception("TermList.Id value missing.");
    }
    else
    {
        string list_id = list.Id.Value.ToString();
        Console.WriteLine("Term list created. ID: {0}.", list_id);
        Thread.Sleep(throttleRate);
        return list_id;
    }
}
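The sample stays under the rate limit by sleeping for throttleRate milliseconds after every call. If a 429 error still occurs, one common approach is to retry with an increasing delay. The following is a minimal sketch of that idea, not part of the SDK: the Throttle class, the WithRetry method, and the isRateLimited predicate are all hypothetical, and how you recognize a 429 depends on the exception type the SDK actually throws, so check the SDK reference before relying on it.

```csharp
using System;
using System.Threading;

/// <summary>
/// Hypothetical retry helper (not part of the SDK). Retries a call when the
/// caller-supplied predicate reports that the failure was a rate-limit error.
/// </summary>
public static class Throttle
{
    public static T WithRetry<T>(
        Func<T> call,
        Func<Exception, bool> isRateLimited,
        int maxAttempts = 3,
        int initialDelayMs = 3000)
    {
        if (maxAttempts < 1)
        {
            throw new ArgumentOutOfRangeException(nameof(maxAttempts));
        }

        int delayMs = initialDelayMs;
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return call();
            }
            // Only retry rate-limit failures, and only while attempts remain.
            // Any other exception, or the final failure, propagates to the caller.
            catch (Exception ex) when (attempt < maxAttempts && isRateLimited(ex))
            {
                Console.WriteLine(
                    "Rate limited on attempt {0}; waiting {1} ms.", attempt, delayMs);
                Thread.Sleep(delayMs);
                delayMs *= 2; // Double the wait before the next attempt.
            }
        }
    }
}
```

A call might then be wrapped as Throttle.WithRetry(() => CreateTermList(client), ex => /* your 429 check */ false), where the predicate shown is a placeholder that you replace with a check suited to the SDK's exception.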

Update term list name and description

You update the term list information with ContentModeratorClient.ListManagementTermLists.Update. The first parameter to Update is the term list ID. The second parameter is a MIME type, which should be "application/json". For more information, see the API reference. The third parameter is a Body object, which contains the new name and description.

Add the following method definition to the Program class in the TermLists namespace.

/// <summary>
/// Update the information for the indicated term list.
/// </summary>
/// <param name="client">The Content Moderator client.</param>
/// <param name="list_id">The ID of the term list to update.</param>
/// <param name="name">The new name for the term list.</param>
/// <param name="description">The new description for the term list.</param>
static void UpdateTermList (ContentModeratorClient client, string list_id, string name = null, string description = null)
{
    Console.WriteLine("Updating information for term list with ID {0}.", list_id);
    Body body = new Body(name, description);
    client.ListManagementTermLists.Update(list_id, "application/json", body);
    Thread.Sleep(throttleRate);
}

Add a term to a term list

Add the following method definition to the Program class in the TermLists namespace.

/// <summary>
/// Add a term to the indicated term list.
/// </summary>
/// <param name="client">The Content Moderator client.</param>
/// <param name="list_id">The ID of the term list to update.</param>
/// <param name="term">The term to add to the term list.</param>
static void AddTerm (ContentModeratorClient client, string list_id, string term)
{
    Console.WriteLine("Adding term \"{0}\" to term list with ID {1}.", term, list_id);
    client.ListManagementTerm.AddTerm(list_id, term, lang);
    Thread.Sleep(throttleRate);
}

Get all terms in a term list

Add the following method definition to the Program class in the TermLists namespace.

/// <summary>
/// Get all terms in the indicated term list.
/// </summary>
/// <param name="client">The Content Moderator client.</param>
/// <param name="list_id">The ID of the term list from which to get all terms.</param>
static void GetAllTerms(ContentModeratorClient client, string list_id)
{
    Console.WriteLine("Getting terms in term list with ID {0}.", list_id);
    Terms terms = client.ListManagementTerm.GetAllTerms(list_id, lang);
    TermsData data = terms.Data;
    foreach (TermsInList term in data.Terms)
    {
        Console.WriteLine(term.Term);
    }
    Thread.Sleep(throttleRate);
}

Add code to refresh the search index

After you make changes to a term list, refresh its search index so that the changes are included the next time you use the term list to screen text. This is similar to how a search engine on your desktop (if enabled) or a web search engine continually refreshes its index to include new files or pages.

You refresh a term list search index with ContentModeratorClient.ListManagementTermLists.RefreshIndexMethod.

Add the following method definition to the Program class in the TermLists namespace.

/// <summary>
/// Refresh the search index for the indicated term list.
/// </summary>
/// <param name="client">The Content Moderator client.</param>
/// <param name="list_id">The ID of the term list to refresh.</param>
static void RefreshSearchIndex (ContentModeratorClient client, string list_id)
{
    Console.WriteLine("Refreshing search index for term list with ID {0}.", list_id);
    client.ListManagementTermLists.RefreshIndexMethod(list_id, lang);
    Thread.Sleep((int)(latencyDelay * 60 * 1000));
}
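RefreshSearchIndex converts latencyDelay from minutes to milliseconds with inline arithmetic. The same conversion can be expressed with TimeSpan, which some readers find easier to check. This is only a sketch; the DelayMath class and MinutesToMilliseconds method are hypothetical names introduced for illustration.

```csharp
using System;

/// <summary>
/// Hypothetical helper (not part of the sample) that converts a delay
/// expressed in minutes to whole milliseconds for Thread.Sleep.
/// </summary>
public static class DelayMath
{
    public static int MinutesToMilliseconds(double minutes)
    {
        // For example, 0.5 minutes * 60 s/min * 1000 ms/s = 30000 ms.
        return (int)TimeSpan.FromMinutes(minutes).TotalMilliseconds;
    }
}
```

With the sample's latencyDelay of 0.5, this yields the same 30-second wait as the expression (int)(latencyDelay * 60 * 1000).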

Screen text using a term list

You screen text using a term list with ContentModeratorClient.TextModeration.ScreenText, which takes the following parameters.

  • The language of the terms in the term list.
  • A MIME type, which can be "text/html", "text/xml", "text/markdown", or "text/plain".
  • The text to screen.
  • A boolean value. Set this field to true to autocorrect the text before screening it.
  • A boolean value. Set this field to true to detect personal data in the text.
  • The term list ID.

For more information, see the API reference.

ScreenText returns a Screen object, which has a Terms property that lists any terms that Content Moderator detected in the screening. Note that if Content Moderator did not detect any terms during the screening, the Terms property has the value null.

Add the following method definition to the Program class in the TermLists namespace.

/// <summary>
/// Screen the indicated text for terms in the indicated term list.
/// </summary>
/// <param name="client">The Content Moderator client.</param>
/// <param name="list_id">The ID of the term list to use to screen the text.</param>
/// <param name="text">The text to screen.</param>
static void ScreenText (ContentModeratorClient client, string list_id, string text)
{
    Console.WriteLine("Screening text: \"{0}\" using term list with ID {1}.", text, list_id);
    Screen screen = client.TextModeration.ScreenText(lang, "text/plain", text, false, false, list_id);
    if (null == screen.Terms)
    {
        Console.WriteLine("No terms from the term list were detected in the text.");
    }
    else
    {
        foreach (DetectedTerms term in screen.Terms)
        {
            Console.WriteLine(String.Format("Found term: \"{0}\" from list ID {1} at index {2}.", term.Term, term.ListId, term.Index));
        }
    }
    Thread.Sleep(throttleRate);
}

Delete terms and lists

Deleting a term or a list is straightforward. You use the SDK to do the following tasks:

  • Delete a term. (ContentModeratorClient.ListManagementTerm.DeleteTerm)
  • Delete all the terms in a list without deleting the list. (ContentModeratorClient.ListManagementTerm.DeleteAllTerms)
  • Delete a list and all of its contents. (ContentModeratorClient.ListManagementTermLists.Delete)

Delete a term

Add the following method definition to the Program class in the TermLists namespace.

/// <summary>
/// Delete a term from the indicated term list.
/// </summary>
/// <param name="client">The Content Moderator client.</param>
/// <param name="list_id">The ID of the term list from which to delete the term.</param>
/// <param name="term">The term to delete.</param>
static void DeleteTerm (ContentModeratorClient client, string list_id, string term)
{
    Console.WriteLine("Removed term \"{0}\" from term list with ID {1}.", term, list_id);
    client.ListManagementTerm.DeleteTerm(list_id, term, lang);
    Thread.Sleep(throttleRate);
}

Delete all terms in a term list

Add the following method definition to the Program class in the TermLists namespace.

/// <summary>
/// Delete all terms from the indicated term list.
/// </summary>
/// <param name="client">The Content Moderator client.</param>
/// <param name="list_id">The ID of the term list from which to delete all terms.</param>
static void DeleteAllTerms (ContentModeratorClient client, string list_id)
{
    Console.WriteLine("Removing all terms from term list with ID {0}.", list_id);
    client.ListManagementTerm.DeleteAllTerms(list_id, lang);
    Thread.Sleep(throttleRate);
}

Delete a term list

Add the following method definition to the Program class in the TermLists namespace.

/// <summary>
/// Delete the indicated term list.
/// </summary>
/// <param name="client">The Content Moderator client.</param>
/// <param name="list_id">The ID of the term list to delete.</param>
static void DeleteTermList (ContentModeratorClient client, string list_id)
{
    Console.WriteLine("Deleting term list with ID {0}.", list_id);
    client.ListManagementTermLists.Delete(list_id);
    Thread.Sleep(throttleRate);
}

Compose the Main method

Add the Main method definition to the Program class in the TermLists namespace. Finally, close the Program class and the TermLists namespace.

static void Main(string[] args)
{
    using (var client = Clients.NewClient())
    {
        string list_id = CreateTermList(client);

        UpdateTermList(client, list_id, "name", "description");
        AddTerm(client, list_id, "term1");
        AddTerm(client, list_id, "term2");

        GetAllTerms(client, list_id);

        // Always remember to refresh the search index of your list
        RefreshSearchIndex(client, list_id);

        string text = "This text contains the terms \"term1\" and \"term2\".";
        ScreenText(client, list_id, text);

        DeleteTerm(client, list_id, "term1");

        // Always remember to refresh the search index of your list
        RefreshSearchIndex(client, list_id);

        text = "This text contains the terms \"term1\" and \"term2\".";
        ScreenText(client, list_id, text);

        DeleteAllTerms(client, list_id);
        DeleteTermList(client, list_id);

        Console.WriteLine("Press ENTER to close the application.");
        Console.ReadLine();
    }
}

Run the application to see the output

Your console output will look like the following:

Creating term list.
Term list created. ID: 252.
Updating information for term list with ID 252.

Adding term "term1" to term list with ID 252.
Adding term "term2" to term list with ID 252.

Getting terms in term list with ID 252.
term1
term2

Refreshing search index for term list with ID 252.

Screening text: "This text contains the terms "term1" and "term2"." using term list with ID 252.
Found term: "term1" from list ID 252 at index 32.
Found term: "term2" from list ID 252 at index 46.

Removed term "term1" from term list with ID 252.

Refreshing search index for term list with ID 252.

Screening text: "This text contains the terms "term1" and "term2"." using term list with ID 252.
Found term: "term2" from list ID 252 at index 46.

Removing all terms from term list with ID 252.
Deleting term list with ID 252.
Press ENTER to close the application.

Next steps

Get the Content Moderator .NET SDK and the Visual Studio solution for this and other Content Moderator quickstarts for .NET, and get started on your integration.