How to filter by language in Azure Cognitive Search

A key requirement in a multilingual search application is the ability to search over and retrieve results in the user's own language. In Azure Cognitive Search, one way to meet the language requirements of a multilingual app is to create a series of fields, each dedicated to storing strings in a specific language, and then constrain full text search to just those fields at query time.

Query parameters on the request do two things: they scope the search operation to specific fields, and they trim from the results any fields whose content doesn't fit the search experience you want to deliver.

Parameter       Purpose
searchFields    Limits full text search to the list of named fields.
$select         Trims the response to include only the fields you specify. By default, all retrievable fields are returned. The $select parameter lets you choose which ones to return.

The success of this technique hinges on the integrity of field contents. Azure Cognitive Search does not translate strings or perform language detection. It is up to you to make sure that each field contains strings in the language you expect.
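
Because the service does no language detection, the check has to happen before documents are uploaded. The following Python sketch shows one way to do that when building an upload batch for the REST API. The source record shape, the `listingId` key, and the `description_en` field are hypothetical names chosen for illustration; only `description_fr` appears in this article.

```python
import json

# Hypothetical source records: each string is already tagged with its language.
LISTINGS = [
    {
        "listingId": "1",
        "description": {
            "en": "Bright two-bedroom apartment",
            "fr": "Appartement lumineux de deux chambres",
        },
    },
]


def to_index_document(listing, languages=("en", "fr")):
    """Map one source record to an index document with one field per language."""
    doc = {"@search.action": "upload", "listingId": listing["listingId"]}
    for lang in languages:
        text = listing["description"].get(lang)
        # Fail loudly instead of letting an empty or missing string
        # end up in a language-specific field.
        if not text:
            raise ValueError(
                f"listing {listing['listingId']} has no '{lang}' description"
            )
        doc[f"description_{lang}"] = text
    return doc


def build_upload_batch(listings):
    """Wrap the documents in the 'value' array used for document uploads."""
    return {"value": [to_index_document(item) for item in listings]}


if __name__ == "__main__":
    print(json.dumps(build_upload_batch(LISTINGS), indent=2))
```

The sketch only verifies that a string is present for each language. Whether the string really is French or English still depends on the tagging in your source data.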

Define fields for content in different languages

In Azure Cognitive Search, queries target a single index. Developers who want to provide language-specific strings in a single search experience typically define dedicated fields to store the values: one field for English strings, one for French, and so on.

In our samples, including the real-estate sample shown below, you might have seen field definitions similar to the following screenshot. Notice how this example shows the language analyzer assignments for the fields in this index. Fields that contain strings perform better in full text search when paired with an analyzer engineered to handle the linguistic rules of the target language.

Note

For code examples showing field definitions with language analyzers, see Define an index (.NET) and Define an index (REST).
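
As a rough illustration of the field layout described in this section, the following Python sketch builds an index definition with one searchable field per language, each paired with a language analyzer. The index name, the `listingId` key field, and the `description_en` field are assumptions made for the example, and the choice of the `en.microsoft` and `fr.microsoft` analyzers is illustrative rather than taken from the sample index.

```python
import json


def language_field(name, analyzer):
    """Describe a searchable, retrievable string field bound to one analyzer."""
    return {
        "name": name,
        "type": "Edm.String",
        "searchable": True,
        "retrievable": True,
        "analyzer": analyzer,
    }


def build_index_definition(index_name):
    """Return an index definition with one description field per language."""
    return {
        "name": index_name,
        "fields": [
            # Hypothetical key field; every index needs exactly one key.
            {"name": "listingId", "type": "Edm.String", "key": True},
            language_field("description_en", "en.microsoft"),
            language_field("description_fr", "fr.microsoft"),
        ],
    }


if __name__ == "__main__":
    # "realestate-demo" is a made-up index name.
    print(json.dumps(build_index_definition("realestate-demo"), indent=2))
```

The resulting JSON is the kind of body you would send when creating the index. Keeping the language in the field name (`description_fr`) makes it easy to pick the right fields at query time.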

Build and load an index

An intermediate (and perhaps obvious) step is that you have to build and populate the index before formulating a query. We mention this step here for completeness. One way to determine whether the index is available is by checking the indexes list in the portal.
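
If you prefer to check from code rather than the portal, the same question can be answered from the list of indexes returned by the REST API. The Python sketch below only builds the request URL and interprets a response body; it does not send anything. The endpoint value, the index names, and the `2020-06-30` API version are assumptions, so substitute the values that apply to your service.

```python
import json
from urllib.parse import urlencode

API_VERSION = "2020-06-30"  # Assumed version; use the one your service supports.


def list_indexes_url(endpoint):
    """Build the URL that lists index names for a search service endpoint."""
    # safe="$" keeps the "$select" parameter name unescaped in the query string.
    query = urlencode({"api-version": API_VERSION, "$select": "name"}, safe="$")
    return f"{endpoint.rstrip('/')}/indexes?{query}"


def index_exists(response_body, index_name):
    """Check a list-indexes response body (JSON text) for a given index name."""
    payload = json.loads(response_body)
    return any(item.get("name") == index_name for item in payload.get("value", []))


if __name__ == "__main__":
    # Hypothetical endpoint and response, for illustration only.
    print(list_indexes_url("https://my-service.search.windows.net"))
    sample = '{"value": [{"name": "realestate-demo"}]}'
    print(index_exists(sample, "realestate-demo"))
```

Send the request with the HTTP client of your choice and include your service's `api-key` header. Note that an index appearing in the list only tells you it exists, not that it has been populated.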

Constrain the query and trim results

Parameters on the query are used to limit search to specific fields and then trim from the results any fields not helpful to your scenario. Given a goal of constraining search to fields containing French strings, you would use searchFields to target the query at the fields that hold strings in that language.

By default, a search returns all fields that are marked as retrievable. As such, you might want to exclude fields that don't conform to the language-specific search experience you want to provide. Specifically, if you limited search to a field with French strings, you probably want to exclude fields with English strings from your results. Using the $select query parameter gives you control over which fields are returned to the calling application.

// Search only the French field, and return only that field in the results.
parameters =
    new SearchParameters()
    {
        SearchFields = new[] { "description_fr" },
        Select = new[] { "description_fr" }
    };
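
For callers that use the REST API instead of the .NET SDK, the same two constraints can be expressed in the body of a search request. The Python sketch below builds such a body; the helper name and the search term are hypothetical, and the sketch assumes the search and the response should use the same set of language fields.

```python
import json


def build_language_query(search_text, language_fields):
    """Build a search request body scoped to, and returning, the given fields."""
    fields = ",".join(language_fields)
    return {
        "search": search_text,
        # Restrict full text search to the language-specific fields.
        "searchFields": fields,
        # Return only those same fields to the caller.
        "select": fields,
    }


if __name__ == "__main__":
    # "maison" is an arbitrary French search term used for illustration.
    body = build_language_query("maison", ["description_fr"])
    print(json.dumps(body, indent=2))
```

In a request body the parameter is written `select`; the `$select` form applies when the parameters are passed in the query string. If the response should also carry non-language fields such as a key, pass a separate list for `select` instead of reusing the search fields.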

Note

Although there is no $filter argument on the query, this use case is strongly affiliated with filter concepts, so we present it as a filtering scenario.

See also