Named Entity Recognition cognitive skill
The Named Entity Recognition skill extracts named entities from text. Available entities include the types person, location, and organization.
Important
The Named Entity Recognition skill is now discontinued and replaced by Microsoft.Skills.Text.EntityRecognitionSkill. Support stopped on February 15, 2019, and the API was removed from the product on May 2, 2019. Follow the recommendations in Deprecated cognitive search skills to migrate to a supported skill.
Note
As you expand scope by increasing the frequency of processing, adding more documents, or adding more AI algorithms, you will need to attach a billable Cognitive Services resource. Charges accrue when calling APIs in Cognitive Services, and for image extraction as part of the document-cracking stage in Azure Cognitive Search. There are no charges for text extraction from documents.
Execution of built-in skills is charged at the existing Cognitive Services pay-as-you-go price. Image extraction pricing is described on the Azure Cognitive Search pricing page.
@odata.type
Microsoft.Skills.Text.NamedEntityRecognitionSkill
Data limits
The maximum size of a record should be 50,000 characters as measured by String.Length. If you need to break up your data before sending it to the skill, consider using the Text Split skill.
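Assuming String.Length here refers to the .NET property, it counts UTF-16 code units, so a length check written in another language can disagree for text that contains characters outside the Basic Multilingual Plane. The Python sketch below shows one way to pre-check a record against the limit before sending it. The function names are hypothetical, and the check is a client-side convenience, not something the service requires.

```python
MAX_RECORD_LENGTH = 50_000  # limit stated in the Data limits section


def utf16_length(text: str) -> int:
    """Count UTF-16 code units, which is what .NET's String.Length reports."""
    # Each UTF-16 code unit is 2 bytes; "utf-16-le" avoids adding a BOM.
    return len(text.encode("utf-16-le")) // 2


def fits_in_record(text: str, limit: int = MAX_RECORD_LENGTH) -> bool:
    """Return True if the text is within the record size limit."""
    return utf16_length(text) <= limit


print(utf16_length("Joe Romero"))      # plain ASCII: one code unit per character
print(fits_in_record("a" * 50_001))    # one character over the limit
```

Records that fail the check are candidates for the Text Split skill.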
Skill parameters
Parameters are case-sensitive.
Parameter name | Description |
---|---|
categories | Array of categories that should be extracted. Possible category types: "Person", "Location", "Organization". If no category is provided, all types are returned. |
defaultLanguageCode | Language code of the input text. The following languages are supported: de, en, es, fr, it |
minimumPrecision | A number between 0 and 1. If the precision is lower than this value, the entity is not returned. The default is 0. |
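As a rough illustration of how these parameters fit together, the Python sketch below assembles a skill definition and checks the values against the constraints in the table. The helper name `build_ner_skill` and the client-side validation are assumptions for illustration, not part of the service. The input source and output names mirror the sample definition in this article.

```python
import json

# Category names listed for the `categories` parameter in the table above.
ALLOWED_CATEGORIES = ("Person", "Location", "Organization")


def build_ner_skill(categories=None, default_language_code="en", minimum_precision=0.0):
    """Assemble a skill definition dict (hypothetical helper, not a service API)."""
    if categories:
        # Assumption: category values are matched exactly as spelled in the table.
        unknown = [c for c in categories if c not in ALLOWED_CATEGORIES]
        if unknown:
            raise ValueError(f"Unsupported categories: {unknown}")
    if not 0 <= minimum_precision <= 1:
        raise ValueError("minimumPrecision must be between 0 and 1")

    skill = {
        "@odata.type": "#Microsoft.Skills.Text.NamedEntityRecognitionSkill",
        "defaultLanguageCode": default_language_code,
        "minimumPrecision": minimum_precision,
        "inputs": [{"name": "text", "source": "/document/content"}],
        "outputs": [{"name": "persons", "targetName": "people"}],
    }
    if categories:
        # Omitting the key leaves the default behavior: all types are returned.
        skill["categories"] = list(categories)
    return skill


skill = build_ner_skill(categories=["Person"], minimum_precision=0.5)
print(json.dumps(skill, indent=2))
```

Leaving `categories` out of the definition, rather than listing all three values, keeps the sketch aligned with the documented default.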
Skill inputs
Input name | Description |
---|---|
languageCode | Optional. Default is "en". |
text | The text to analyze. |
Skill outputs
Output name | Description |
---|---|
persons | An array of strings where each string represents the name of a person. |
locations | An array of strings where each string represents a location. |
organizations | An array of strings where each string represents an organization. |
entities | An array of complex types. Each complex type includes the following fields: category ("person", "location", or "organization"), value (the entity text), offset (the position in the text where the entity was found), and confidence (a number between 0 and 1). |
Sample definition
{
  "@odata.type": "#Microsoft.Skills.Text.NamedEntityRecognitionSkill",
  "categories": [ "Person", "Location", "Organization"],
  "defaultLanguageCode": "en",
  "inputs": [
    {
      "name": "text",
      "source": "/document/content"
    }
  ],
  "outputs": [
    {
      "name": "persons",
      "targetName": "people"
    }
  ]
}
Sample input
{
  "values": [
    {
      "recordId": "1",
      "data":
      {
        "text": "This is the loan application for Joe Romero, a Microsoft employee who was born in Chile and who then moved to Australia… Ana Smith is provided as a reference.",
        "languageCode": "en"
      }
    }
  ]
}
Sample output
{
"values": [
{
"recordId": "1",
"data" :
{
"persons": [ "Joe Romero", "Ana Smith"],
"locations": ["Chile", "Australia"],
"organizations":["Microsoft"],
"entities":
[
{
"category":"person",
"value": "Joe Romero",
"offset": 33,
"confidence": 0.87
},
{
"category":"person",
"value": "Ana Smith",
"offset": 124,
"confidence": 0.87
},
{
"category":"location",
"value": "Chile",
"offset": 88,
"confidence": 0.99
},
{
"category":"location",
"value": "Australia",
"offset": 112,
"confidence": 0.99
},
{
"category":"organization",
"value": "Microsoft",
"offset": 54,
"confidence": 0.99
}
]
}
}
]
}
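A consumer of this output will often want the entities grouped by category, or filtered by confidence after the fact. The following Python sketch shows one way to do that. The function name `group_entities` and the `min_confidence` threshold are hypothetical, and the data is a trimmed copy of the sample output above.

```python
from collections import defaultdict


def group_entities(data, min_confidence=0.0):
    """Group entity values by category, keeping those at or above min_confidence."""
    grouped = defaultdict(list)
    for entity in data.get("entities", []):
        if entity["confidence"] >= min_confidence:
            grouped[entity["category"]].append(entity["value"])
    return dict(grouped)


# Trimmed copy of the "data" object from the sample output.
sample_data = {
    "entities": [
        {"category": "person", "value": "Joe Romero", "offset": 33, "confidence": 0.87},
        {"category": "location", "value": "Chile", "offset": 88, "confidence": 0.99},
        {"category": "organization", "value": "Microsoft", "offset": 54, "confidence": 0.99},
    ]
}

# With a 0.9 threshold, the person entity (confidence 0.87) is dropped.
print(group_entities(sample_data, min_confidence=0.9))
```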
Error cases
If the language code for the document is unsupported, an error is returned and no entities are extracted.
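One way to avoid this error is to screen records before they reach the skill. The Python sketch below partitions records by whether their languageCode is in the supported list from the skill parameters (de, en, es, fr, it). The function name is hypothetical, and falling back to "en" when languageCode is missing follows the documented input default.

```python
# Languages listed as supported in the skill parameters table.
SUPPORTED_LANGUAGES = frozenset({"de", "en", "es", "fr", "it"})


def split_by_language_support(records, default_language="en"):
    """Partition records into (supported, unsupported) by their languageCode."""
    supported, unsupported = [], []
    for record in records:
        # A missing or empty languageCode falls back to the default.
        code = record.get("data", {}).get("languageCode") or default_language
        if code in SUPPORTED_LANGUAGES:
            supported.append(record)
        else:
            unsupported.append(record)
    return supported, unsupported


records = [
    {"recordId": "1", "data": {"text": "Joe Romero", "languageCode": "en"}},
    {"recordId": "2", "data": {"text": "...", "languageCode": "ja"}},
    {"recordId": "3", "data": {"text": "Ana Smith"}},
]
ok, skipped = split_by_language_support(records)
print([r["recordId"] for r in ok], [r["recordId"] for r in skipped])
```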