Example: Add synonyms for Azure Cognitive Search in C#

Synonyms expand a query by matching on terms considered semantically equivalent to the input term. For example, you might want "car" to match documents containing the terms "automobile" or "vehicle".

In Azure Cognitive Search, synonyms are defined in a synonym map, through mapping rules that associate equivalent terms. This example covers the essential steps for adding synonyms and using them with an existing index. You learn how to:

  • Create a synonym map using the SynonymMap class.
  • Set the SynonymMaps property on fields that should support query expansion via synonyms.

You can query a synonym-enabled field as you would normally. No additional query syntax is required to access synonyms.

You can create multiple synonym maps, post them as a service-wide resource available to any index, and then reference which one to use at the field level. At query time, in addition to searching an index, Azure Cognitive Search does a lookup in a synonym map, if one is specified on fields used in the query.

Note

Synonyms can be created programmatically, but not in the portal. If Azure portal support for synonyms would be useful to you, please provide your feedback on UserVoice.

Prerequisites

Tutorial requirements include the following:

Overview

Before-and-after queries demonstrate the value of synonyms. In this example, we use a sample application that executes queries against a sample index and returns the results. The sample application creates a small index named "hotels" populated with two documents. The application executes search queries using terms and phrases that do not appear in the index, enables the synonyms feature, then issues the same searches again. The code below demonstrates the overall flow.

  static void Main(string[] args)
  {
      SearchServiceClient serviceClient = CreateSearchServiceClient();

      Console.WriteLine("{0}", "Cleaning up resources...\n");
      CleanupResources(serviceClient);

      Console.WriteLine("{0}", "Creating index...\n");
      CreateHotelsIndex(serviceClient);

      ISearchIndexClient indexClient = serviceClient.Indexes.GetClient("hotels");

      Console.WriteLine("{0}", "Uploading documents...\n");
      UploadDocuments(indexClient);

      ISearchIndexClient indexClientForQueries = CreateSearchIndexClient();

      RunQueriesWithNonExistentTermsInIndex(indexClientForQueries);

      Console.WriteLine("{0}", "Adding synonyms...\n");
      UploadSynonyms(serviceClient);
      EnableSynonymsInHotelsIndex(serviceClient);
      Thread.Sleep(10000); // Wait for the changes to propagate

      RunQueriesWithNonExistentTermsInIndex(indexClientForQueries);

      Console.WriteLine("{0}", "Complete.  Press any key to end application...\n");

      Console.ReadKey();
  }

The steps to create and populate the sample index are explained in How to use Azure Cognitive Search from a .NET Application.

"Before" queries

In RunQueriesWithNonExistentTermsInIndex, the application issues search queries with "five star", "internet", and "economy AND hotel".

Console.WriteLine("Search the entire index for the phrase \"five star\":\n");
results = indexClient.Documents.Search<Hotel>("\"five star\"", parameters);
WriteDocuments(results);

Console.WriteLine("Search the entire index for the term 'internet':\n");
results = indexClient.Documents.Search<Hotel>("internet", parameters);
WriteDocuments(results);

Console.WriteLine("Search the entire index for the terms 'economy' AND 'hotel':\n");
results = indexClient.Documents.Search<Hotel>("economy AND hotel", parameters);
WriteDocuments(results);

Neither of the two indexed documents contains these terms, so the first call to RunQueriesWithNonExistentTermsInIndex produces the following output.

Search the entire index for the phrase "five star":

no document matched

Search the entire index for the term 'internet':

no document matched

Search the entire index for the terms 'economy' AND 'hotel':

no document matched

Enable synonyms

Enabling synonyms is a two-step process. We first define and upload synonym rules and then configure fields to use them. The process is outlined in UploadSynonyms and EnableSynonymsInHotelsIndex.

  1. Add a synonym map to your search service. In UploadSynonyms, we define four rules in our synonym map 'desc-synonymmap' and upload it to the service.

     var synonymMap = new SynonymMap()
     {
         Name = "desc-synonymmap",
         Format = "solr",
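         // One rule per line. A regular C# string literal cannot span lines,
         // so the rules are concatenated with "+".
         Synonyms = "hotel, motel\n" +
                    "internet,wifi\n" +
                    "five star=>luxury\n" +
                    "economy,inexpensive=>budget"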
     };
    
     serviceClient.SynonymMaps.CreateOrUpdate(synonymMap);
    

    A synonym map must conform to the open source standard Solr format. The format is explained in Synonyms in Azure Cognitive Search under the section Apache Solr synonym format.

  2. Configure searchable fields to use the synonym map in the index definition. In EnableSynonymsInHotelsIndex, we enable synonyms on two fields, category and tags, by setting the synonymMaps property to the name of the newly uploaded synonym map.

    Index index = serviceClient.Indexes.Get("hotels");
    index.Fields.First(f => f.Name == "category").SynonymMaps = new[] { "desc-synonymmap" };
    index.Fields.First(f => f.Name == "tags").SynonymMaps = new[] { "desc-synonymmap" };
    
    serviceClient.Indexes.CreateOrUpdate(index);
    

    When you add a synonym map, index rebuilds are not required. You can add a synonym map to your service, and then amend existing field definitions in any index to use the new synonym map. The addition of new attributes has no impact on index availability. The same applies when disabling synonyms for a field: you can simply set the synonymMaps property to an empty list.

    index.Fields.First(f => f.Name == "category").SynonymMaps = new List<string>();
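As a side note on step 1, the rule string can also be assembled from individual rules rather than written as one long literal. The following is a minimal sketch, not part of the sample application; the SynonymRules class and its Build method are hypothetical names introduced here for illustration.

```csharp
using System;

// Hypothetical helper, not part of the sample application or the SDK.
public static class SynonymRules
{
    // Joins individual Solr-format rules into the newline-separated string
    // that step 1 assigns to the Synonyms property.
    public static string Build(params string[] rules)
    {
        return string.Join("\n", rules);
    }

    public static void Main()
    {
        string synonyms = Build(
            "hotel, motel",
            "internet,wifi",
            "five star=>luxury",
            "economy,inexpensive=>budget");

        // Prints the four rules, one per line.
        Console.WriteLine(synonyms);
    }
}
```

Keeping one rule per array element makes it harder to drop a separator by accident when rules are added or removed.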
    

"After" queries

After the synonym map is uploaded and the index is updated to use it, the second RunQueriesWithNonExistentTermsInIndex call outputs the following:

Search the entire index for the phrase "five star":

Name: Fancy Stay        Category: Luxury        Tags: [pool, view, wifi, concierge]

Search the entire index for the term 'internet':

Name: Fancy Stay        Category: Luxury        Tags: [pool, view, wifi, concierge]

Search the entire index for the terms 'economy' AND 'hotel':

Name: Roach Motel       Category: Budget        Tags: [motel, budget]

The first query finds the document through the rule five star=>luxury. The second query expands the search using internet,wifi, and the third uses both hotel, motel and economy,inexpensive=>budget to find the matching document.
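To make the effect of the two rule types concrete, here is a simplified local model of how the four rules rewrite the query terms used above. It is only a sketch: the actual expansion happens inside the search service at query time, and the SynonymSketch class and its Expand method are hypothetical names that are not part of the SDK. It assumes that an equivalence rule (a,b) expands a term to every term in the rule, while an explicit mapping (a=>b) replaces the left-hand term with the right-hand side.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Illustrative only: a simplified, local model of Solr-style synonym rules.
public static class SynonymSketch
{
    // Returns the terms that a single query term or phrase expands to.
    public static List<string> Expand(string rules, string term)
    {
        foreach (var line in rules.Split('\n'))
        {
            var sides = line.Split(new[] { "=>" }, StringSplitOptions.None);
            var left = SplitTerms(sides[0]);
            if (!left.Contains(term, StringComparer.OrdinalIgnoreCase))
            {
                continue; // this rule does not mention the term
            }

            // Explicit mapping (a=>b): replace the term with the right-hand side.
            // Equivalence (a,b): expand the term to every term in the rule.
            return sides.Length == 2 ? SplitTerms(sides[1]) : left;
        }

        return new List<string> { term }; // no rule applies
    }

    private static List<string> SplitTerms(string side)
    {
        return side.Split(',')
                   .Select(t => t.Trim())
                   .Where(t => t.Length > 0)
                   .ToList();
    }

    public static void Main()
    {
        const string rules = "hotel, motel\n" +
                             "internet,wifi\n" +
                             "five star=>luxury\n" +
                             "economy,inexpensive=>budget";

        foreach (var term in new[] { "five star", "internet", "economy", "hotel" })
        {
            Console.WriteLine("{0} -> {1}", term, string.Join(" | ", Expand(rules, term)));
        }
    }
}
```

In this model, "five star" and "economy" are replaced by "luxury" and "budget", while "internet" and "hotel" are widened to include "wifi" and "motel". Those are the terms that appear in the category and tags values of the two sample documents.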

Adding synonyms completely changes the search experience. In this example, the original queries failed to return meaningful results even though the documents in our index were relevant. By enabling synonyms, we can expand an index to include terms in common use, with no changes to the underlying data in the index.

Sample application source code

You can find the full source code of the sample application used in this walkthrough on GitHub.

Clean up resources

The fastest way to clean up after this example is to delete the resource group containing the Azure Cognitive Search service. You can delete the resource group now to permanently delete everything in it. In the portal, the resource group name is on the Overview page of the Azure Cognitive Search service.

Next steps

This example demonstrated the synonyms feature in C# code: it created and posted mapping rules, and then used the synonym map in queries. Additional information can be found in the .NET SDK and REST API reference documentation.