Create a suggester to enable autocomplete and suggested results in a query

In Azure Cognitive Search, "search-as-you-type" is enabled through a suggester construct added to a search index. A suggester supports two experiences: autocomplete, which completes a partial input into a whole-term query, and suggestions, which invite click-through to a particular match. Autocomplete produces a query. Suggestions produce a matching document.

The following screenshot from Create your first app in C# illustrates both. Autocomplete anticipates a potential term, finishing "tw" with "in". Suggestions are mini search results, where a field like hotel name represents a matching hotel search document from the index. For suggestions, you can surface any field that provides descriptive information.

Visual comparison of autocomplete and suggested queries

You can use these features separately or together. Implementing these behaviors in Azure Cognitive Search involves an index component and a query component.

  • In the index, add a suggester. You can use the portal, REST API, or .NET SDK. The remainder of this article focuses on creating a suggester.

  • In the query request, call the Autocomplete or Suggestions API, as described in Use a suggester.

Search-as-you-type support is enabled on a per-field basis for string fields. You can implement both typeahead behaviors within the same search solution if you want an experience similar to the one shown in the screenshot. Both requests target the documents collection of a specific index, and responses are returned after a user has provided an input string of at least three characters.

What is a suggester?

A suggester is an internal data structure that supports search-as-you-type behaviors by storing prefixes for matching on partial queries. As with tokenized terms, prefixes are stored in inverted indexes, one for each field specified in a suggester fields collection.
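To make the idea of prefix storage concrete, here is a conceptual sketch in Python. It is not the service's actual internal structure: the whitespace tokenizer, the two-character minimum prefix length, and the sample hotel names are all assumptions chosen for illustration.

```python
def prefixes(term, min_len=2):
    """Return every leading substring of `term` with at least `min_len` characters."""
    return [term[:i] for i in range(min_len, len(term) + 1)]


def build_prefix_index(field_values, min_len=2):
    """Map each prefix to the IDs of documents whose field has a term starting with it."""
    index = {}
    for doc_id, text in field_values.items():
        # Naive lowercase + whitespace tokenizer, standing in for a real analyzer.
        for term in text.lower().split():
            for prefix in prefixes(term, min_len):
                index.setdefault(prefix, set()).add(doc_id)
    return index


# Hypothetical values for a single field (for example, HotelName), keyed by document ID.
hotel_names = {"1": "Twin Dome Motel", "2": "Triple Landscape Hotel"}
prefix_index = build_prefix_index(hotel_names)

print(sorted(prefix_index["tw"]))  # documents with a term starting with "tw"
print(sorted(prefix_index["ho"]))  # documents with a term starting with "ho"
```

A lookup on a partial input such as "tw" then resolves directly to the matching documents, which is what allows a response while the user is still typing.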

Define a suggester

To create a suggester, add one to an index schema and set each property. The best time to create a suggester is when you are also defining the field that will use it.

  • Use string fields only

  • Use the default standard Lucene analyzer ("analyzer": null) or a language analyzer (for example, "analyzer": "en.microsoft") on the field

Choose fields

Although a suggester has several properties, it is primarily a collection of string fields for which you are enabling a search-as-you-type experience. An index can have only one suggester, so its field list must include every field that contributes content to either suggestions or autocomplete.

Autocomplete benefits from a larger pool of fields to draw from because the additional content has more term completion potential.

Suggestions, on the other hand, produce better results when your field choice is selective. Remember that a suggestion is a proxy for a search document, so you will want fields that best represent a single result. Names, titles, or other unique fields that distinguish among multiple matches work best. If fields consist of repetitive values, the suggestions consist of identical results and a user won't know which one to click.

To satisfy both search-as-you-type experiences, add all of the fields that you need for autocomplete, and then use $select, $top, $filter, and searchFields to control results for suggestions.
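As a rough illustration of narrowing suggestions at query time, the following Python sketch builds a Suggestions request body without sending it. In a POST body the parameters appear without the $ prefix (select, top, filter). The helper name build_suggest_request, the partial input "twi", and the filter on Category are assumptions for this example; the filter would also require Category to be filterable in your index.

```python
import json


def build_suggest_request(partial_query, suggester_name, *, search_fields=None,
                          select=None, filter_expr=None, top=5):
    """Build the JSON body for a POST .../docs/suggest request (nothing is sent)."""
    body = {"search": partial_query, "suggesterName": suggester_name, "top": top}
    if search_fields:
        # Restrict matching to a subset of the suggester's source fields.
        body["searchFields"] = ",".join(search_fields)
    if select:
        # Return only these fields with each suggestion.
        body["select"] = ",".join(select)
    if filter_expr:
        # OData filter applied to the documents considered for suggestions.
        body["filter"] = filter_expr
    return body


suggest_body = build_suggest_request(
    "twi",
    "sg",
    search_fields=["HotelName"],           # match on the most distinctive field only
    select=["HotelName"],                  # show only the hotel name in the dropdown
    filter_expr="Category eq 'Boutique'",  # hypothetical filter value
    top=5,
)
print(json.dumps(suggest_body, indent=2))
```

With this split, the suggester can still list extra fields for autocomplete, while each suggestions request is limited to the fields that identify a single document.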

Choose analyzers

Your choice of analyzer determines how fields are tokenized and subsequently prefixed. For example, for a hyphenated string like "context-sensitive", using a language analyzer results in these token combinations: "context", "sensitive", "context-sensitive". With the standard Lucene analyzer, the hyphenated token "context-sensitive" would not be produced.

When evaluating analyzers, consider using the Analyze Text API for insight into how terms are tokenized and subsequently prefixed. Once you build an index, you can try various analyzers on a string to view token output.
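One way to compare analyzers is to prepare the same Analyze Text request for each candidate. The Python sketch below only builds the relative URL and JSON body and sends nothing. The helper name build_analyze_request is hypothetical, and the index name and API version are borrowed from the examples in this article.

```python
import json


def build_analyze_request(index_name, text, analyzer, api_version="2020-06-30"):
    """Return the relative URL and JSON body for an Analyze Text call (nothing is sent)."""
    url = f"/indexes/{index_name}/analyze?api-version={api_version}"
    body = {"text": text, "analyzer": analyzer}
    return url, body


# Prepare one request per analyzer so the token output can be compared side by side.
for analyzer in ("standard.lucene", "en.microsoft"):
    url, body = build_analyze_request("hotels-sample-index", "context-sensitive", analyzer)
    print("POST", url)
    print(json.dumps(body))
```

Sending each request to your own service with an admin API key would return the tokens each analyzer emits for the string.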

Fields that use custom analyzers or predefined analyzers (with the exception of standard Lucene) are explicitly disallowed to prevent poor outcomes.

Note

If you need to work around the analyzer constraint, for example if you need a keyword or ngram analyzer for certain query scenarios, use two separate fields for the same content. One field can then belong to the suggester, while the other is set up with the custom analyzer configuration.
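The two-field workaround might look like the following index definition, sketched here as a Python dictionary that serializes to the REST payload. The field name HotelNameExact, the key field HotelId, and the choice of the keyword analyzer are assumptions for illustration. Your indexing code would need to write the same value to both name fields.

```python
import json


def string_field(name, analyzer=None):
    """Return a minimal searchable Edm.String field definition."""
    return {"name": name, "type": "Edm.String", "searchable": True, "analyzer": analyzer}


index_definition = {
    "name": "hotels-sample-index",
    "fields": [
        {"name": "HotelId", "type": "Edm.String", "key": True},
        # Suggester-friendly copy: uses a language analyzer.
        string_field("HotelName", analyzer="en.microsoft"),
        # Hypothetical second copy of the same content for exact-match queries.
        string_field("HotelNameExact", analyzer="keyword"),
    ],
    "suggesters": [
        {
            "name": "sg",
            "searchMode": "analyzingInfixMatching",
            # Only the field with the permitted analyzer is listed here.
            "sourceFields": ["HotelName"],
        }
    ],
}

print(json.dumps(index_definition, indent=2))
```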

When to create a suggester

The best time to create a suggester is when you are also creating the field definition itself.

If you try to create a suggester using pre-existing fields, the API will disallow it. Prefixes are generated during indexing, when partial terms in two or more character combinations are tokenized alongside whole terms. Because existing fields are already tokenized, you will have to rebuild the index if you want to add them to a suggester. For more information, see How to rebuild an Azure Cognitive Search index.

Create using REST

In the REST API, add suggesters through Create Index or Update Index.

{
  "name": "hotels-sample-index",
  "fields": [
    . . .
        {
            "name": "HotelName",
            "type": "Edm.String",
            "facetable": false,
            "filterable": false,
            "key": false,
            "retrievable": true,
            "searchable": true,
            "sortable": false,
            "analyzer": "en.microsoft",
            "indexAnalyzer": null,
            "searchAnalyzer": null,
            "synonymMaps": [],
            "fields": []
        }
  ],
  "suggesters": [
    {
      "name": "sg",
      "searchMode": "analyzingInfixMatching",
      "sourceFields": ["HotelName"]
    }
  ],
  "scoringProfiles": [
    . . .
  ]
}

Create using .NET

In C#, define a Suggester object. Suggesters is a collection, but it can only take one item.

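using System.Collections.Generic;
using Microsoft.Azure.Search;
using Microsoft.Azure.Search.Models;

// Assumes a Hotel model class that defines HotelName and Category string fields.
private static void CreateHotelsIndex(SearchServiceClient serviceClient)
{
    var definition = new Index()
    {
        Name = "hotels-sample-index",
        Fields = FieldBuilder.BuildForType<Hotel>(),
        // An index accepts only one suggester, so the list holds a single item.
        Suggesters = new List<Suggester>()
        {
            new Suggester()
            {
                Name = "sg",
                SourceFields = new string[] { "HotelName", "Category" }
            }
        }
    };

    // Create the index, including its suggester, on the search service.
    serviceClient.Indexes.Create(definition);
}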

Property reference

Property  Description
name  The name of the suggester.
searchMode  The strategy used to search for candidate phrases. The only mode currently supported is analyzingInfixMatching, which matches on the beginning of a term.
sourceFields  A list of one or more fields that are the source of the content for suggestions. Fields must be of type Edm.String or Collection(Edm.String). If an analyzer is specified on the field, it must be a named analyzer from the supported list (not a custom analyzer).

As a best practice, specify only those fields that lend themselves to an expected and appropriate response, whether it's a completed string in a search bar or a dropdown list.

A hotel name is a good candidate because it is precise. Verbose fields like descriptions and comments are too dense. Similarly, repetitive fields, such as categories and tags, are less effective. In the examples, we include "category" anyway to demonstrate that you can include multiple fields.

Use a suggester

A suggester is used in a query. After a suggester is created, call the Autocomplete API or the Suggestions API for a search-as-you-type experience.

In a search application, client code should use a library like jQuery UI Autocomplete to collect the partial query and provide the match. For more information about this task, see Add autocomplete or suggested results to client code.

API usage is illustrated in the following call to the Autocomplete REST API. There are two takeaways from this example. First, as with all queries, the operation is against the documents collection of an index, and the query includes a search parameter, which in this case provides the partial query. Second, you must add suggesterName to the request. If a suggester is not defined in the index, a call to autocomplete or suggestions will fail.

POST /indexes/myxboxgames/docs/autocomplete?api-version=2020-06-30
{
  "search": "minecraf",
  "suggesterName": "sg"
}

Sample code

  • The Create your first app in C# (lesson 3 - Add search-as-you-type) sample demonstrates suggester construction, suggested queries, autocomplete, and faceted navigation. This code sample runs on a sandbox Azure Cognitive Search service and uses a pre-loaded Hotels index, so all you have to do is press F5 to run the application. No subscription or sign-in is necessary.

  • DotNetHowToAutocomplete is an older sample containing both C# and Java code. It also demonstrates suggester construction, suggested queries, autocomplete, and faceted navigation. This code sample uses the hosted NYCJobs sample data.

Next steps

We recommend the following article to learn more about request formulation.