Shaper cognitive skill

The Shaper skill consolidates several inputs into a complex type that can be referenced later in the enrichment pipeline. In effect, it lets you create a structure, name the members of that structure, and assign a value to each member. Consolidated fields that are useful in search scenarios include a first and last name combined into a single structure, a city and state combined into a single structure, or a name and birthdate combined into a single structure to establish a unique identity.

Additionally, the Shaper skill illustrated in scenario 3 adds an optional sourceContext property to the input. The source and sourceContext properties are mutually exclusive. If the input is at the context of the skill, use source. If the input is at a different context than the skill context, use sourceContext. The sourceContext property requires you to define a nested input, with the specific element being addressed given as the source.
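The rule above can be checked mechanically. The following is a minimal Python sketch, not part of the service: `validate_input` is a hypothetical helper, and treating an empty nested `inputs` list as an error is an assumption drawn from the statement that sourceContext requires a nested input.

```python
def validate_input(definition):
    """Check one Shaper input definition against the rules described above.

    Hypothetical helper for illustration only; the service performs its own
    validation, which may differ.
    """
    name = definition.get("name", "<unnamed>")
    has_source = "source" in definition
    has_context = "sourceContext" in definition

    # source and sourceContext are mutually exclusive.
    if has_source and has_context:
        raise ValueError(f"input '{name}': use source or sourceContext, not both")

    if has_context:
        # Assumption: a sourceContext without nested inputs is invalid.
        nested = definition.get("inputs")
        if not nested:
            raise ValueError(f"input '{name}': sourceContext needs nested inputs")
        for child in nested:
            validate_input(child)


flat_input = {"name": "title", "source": "/document/content/title"}
nested_input = {
    "name": "chapterTitles",
    "sourceContext": "/document/content/pages/*/chapterTitles/*",
    "inputs": [
        {"name": "title", "source": "/document/content/pages/*/chapterTitles/*/title"}
    ],
}

validate_input(flat_input)
validate_input(nested_input)
```

Running a check like this locally before uploading a skillset can catch a definition that mixes the two properties on the same input.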

The output name is always "output". Internally, the pipeline can map a different name, such as "analyzedText" in the examples below, but the Shaper skill itself returns "output" in the response. This matters if you are debugging enriched documents and notice the naming discrepancy, or if you build a custom skill and structure the response yourself.

Note

The Shaper skill is not bound to a Cognitive Services API, and you are not charged for using it. You should still attach a Cognitive Services resource, however, to override the Free resource option, which limits you to a small number of enrichments per day.

@odata.type

Microsoft.Skills.Util.ShaperSkill

Scenario 1: complex types

Consider a scenario where you want to create a structure called analyzedText that has two members: text and sentiment. In an index, a multi-part searchable field is called a complex type, and it is often created when the source data has a corresponding complex structure that maps to it.

However, another way to create complex types is through the Shaper skill. When you include this skill in a skillset, the in-memory operations during skillset processing can output data shapes with nested structures, which can then be mapped to a complex type in your index.

The following example skill definition provides the member names as the inputs.

{
  "@odata.type": "#Microsoft.Skills.Util.ShaperSkill",
  "context": "/document/content/phrases/*",
  "inputs": [
    {
      "name": "text",
      "source": "/document/content/phrases/*"
    },
    {
      "name": "sentiment",
      "source": "/document/content/phrases/*/sentiment"
    }
  ],
  "outputs": [
    {
      "name": "output",
      "targetName": "analyzedText"
    }
  ]
}

Sample index

A skillset is invoked by an indexer, and an indexer requires an index. A complex field representation in your index might look like the following example.


{
    "name": "my-index",
    "fields": [
        { "name": "myId", "type": "Edm.String", "key": true, "filterable": true },
        {
            "name": "analyzedText",
            "type": "Edm.ComplexType",
            "fields": [
                {
                    "name": "text",
                    "type": "Edm.String",
                    "filterable": false,
                    "sortable": false,
                    "facetable": false,
                    "searchable": true
                },
                {
                    "name": "sentiment",
                    "type": "Edm.Double",
                    "searchable": false,
                    "filterable": true,
                    "sortable": true,
                    "facetable": true
                }
            ]
        }
    ]
}

Skill input

An incoming JSON document that provides usable input for this Shaper skill could be:

{
    "values": [
        {
            "recordId": "1",
            "data": {
                "text": "this movie is awesome",
                "sentiment": 0.9
            }
        }
    ]
}

Skill output

The Shaper skill generates a new element called analyzedText that combines the text and sentiment elements. This output conforms to the index schema, so it can be imported and indexed in an Azure Cognitive Search index.

{
    "values": [
      {
        "recordId": "1",
        "data": {
            "analyzedText": {
                "text": "this movie is awesome",
                "sentiment": 0.9
            }
        }
      }
    ]
}
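To make the relationship between the skill input, the "output" name, and the targetName concrete, here is a minimal Python sketch that emulates the shaping step locally. It is not the service's implementation: `shape` and its `target_name` parameter are hypothetical names introduced for illustration, and the sketch assumes the inputs have already been resolved to values.

```python
def shape(record, target_name="output"):
    """Wrap a record's resolved inputs in one named structure.

    Hypothetical local emulation; by default the structure is named
    "output", as the skill itself does in its response.
    """
    return {
        "recordId": record["recordId"],
        "data": {target_name: dict(record["data"])},
    }


incoming = {
    "values": [
        {
            "recordId": "1",
            "data": {"text": "this movie is awesome", "sentiment": 0.9},
        }
    ]
}

# What the skill returns: the structure is always named "output".
raw = {"values": [shape(r) for r in incoming["values"]]}

# What the enriched document sees once the skillset's targetName
# ("analyzedText" in scenario 1) has been applied.
mapped = {"values": [shape(r, "analyzedText") for r in incoming["values"]]}

print(raw["values"][0]["data"])
print(mapped["values"][0]["data"])
```

Comparing `raw` with `mapped` shows the naming discrepancy described earlier: the members are identical, and only the name of the enclosing structure changes.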

Scenario 2: input consolidation

In another example, imagine that at different stages of pipeline processing you have extracted the title of a book and the chapter titles on different pages of the book. You can now create a single structure composed of these various inputs.

The Shaper skill definition for this scenario might look like the following example:

{
    "@odata.type": "#Microsoft.Skills.Util.ShaperSkill",
    "context": "/document",
    "inputs": [
        {
            "name": "title",
            "source": "/document/content/title"
        },
        {
            "name": "chapterTitles",
            "source": "/document/content/pages/*/chapterTitles/*/title"
        }
    ],
    "outputs": [
        {
            "name": "output",
            "targetName": "titlesAndChapters"
        }
    ]
}

Skill output

In this case, the Shaper flattens all chapter titles into a single array.

{
    "values": [
        {
            "recordId": "1",
            "data": {
                "titlesAndChapters": {
                    "title": "How to be happy",
                    "chapterTitles": [
                        "Start young",
                        "Laugh often",
                        "Eat, sleep and exercise"
                    ]
                }
            }
        }
    ]
}
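The flattening can be pictured as walking the path /document/content/pages/*/chapterTitles/*/title and collecting every match in order. The Python sketch below emulates that locally. The `document` tree, including which chapter sits on which page, is an assumed layout for illustration, and `collect_chapter_titles` is a hypothetical helper rather than anything the service exposes.

```python
def collect_chapter_titles(document):
    """Gather the title of every chapter on every page into one flat list."""
    return [
        chapter["title"]
        for page in document["content"]["pages"]
        for chapter in page["chapterTitles"]
    ]


# Assumed enriched-document layout: two pages, chapters spread across them.
document = {
    "content": {
        "title": "How to be happy",
        "pages": [
            {"chapterTitles": [{"title": "Start young"}]},
            {
                "chapterTitles": [
                    {"title": "Laugh often"},
                    {"title": "Eat, sleep and exercise"},
                ]
            },
        ],
    }
}

titles_and_chapters = {
    "title": document["content"]["title"],
    "chapterTitles": collect_chapter_titles(document),
}
print(titles_and_chapters)
```

The page boundaries disappear in the result: the consumer sees a single list of strings next to the book title.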

Scenario 3: input consolidation from nested contexts

Imagine you have the title, chapters, and contents of a book, and have run entity recognition and key phrase extraction on the contents. You now need to aggregate the results from the different skills into a single shape with the chapter name, entities, and key phrases.

The Shaper skill definition for this scenario might look like the following example:

{
    "@odata.type": "#Microsoft.Skills.Util.ShaperSkill",
    "context": "/document",
    "inputs": [
        {
            "name": "title",
            "source": "/document/content/title"
        },
        {
            "name": "chapterTitles",
            "sourceContext": "/document/content/pages/*/chapterTitles/*",
            "inputs": [
              {
                  "name": "title",
                  "source": "/document/content/pages/*/chapterTitles/*/title"
              },
              {
                  "name": "number",
                  "source": "/document/content/pages/*/chapterTitles/*/number"
              }
            ]
        }

    ],
    "outputs": [
        {
            "name": "output",
            "targetName": "titlesAndChapters"
        }
    ]
}

Skill output

In this case, the Shaper creates a complex type. This structure exists in memory. To save it to a knowledge store, create a projection in your skillset that defines the storage characteristics.

{
    "values": [
        {
            "recordId": "1",
            "data": {
                "titlesAndChapters": {
                    "title": "How to be happy",
                    "chapterTitles": [
                      { "title": "Start young", "number": 1 },
                      { "title": "Laugh often", "number": 2 },
                      { "title": "Eat, sleep and exercise", "number": 3 }
                    ]
                }
            }
        }
    ]
}
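With sourceContext, each element matched by /document/content/pages/*/chapterTitles/* becomes one object whose members come from the nested inputs, instead of one flat list of strings. The Python sketch below emulates this locally. As before, the `document` tree is an assumed layout and `shape_chapters` is a hypothetical helper, not the service's implementation.

```python
def shape_chapters(document):
    """Build one {title, number} object per chapter found on any page."""
    return [
        {"title": chapter["title"], "number": chapter["number"]}
        for page in document["content"]["pages"]
        for chapter in page["chapterTitles"]
    ]


# Assumed enriched-document layout: two pages, chapters spread across them.
document = {
    "content": {
        "title": "How to be happy",
        "pages": [
            {"chapterTitles": [{"title": "Start young", "number": 1}]},
            {
                "chapterTitles": [
                    {"title": "Laugh often", "number": 2},
                    {"title": "Eat, sleep and exercise", "number": 3},
                ]
            },
        ],
    }
}

titles_and_chapters = {
    "title": document["content"]["title"],
    "chapterTitles": shape_chapters(document),
}
print(titles_and_chapters["chapterTitles"])
```

Because title and number are read from the same chapter element, they stay paired in the result, which a flat source path per member would not guarantee.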

See also