Phrase list features in your LUIS app

In machine learning, a feature is a distinguishing trait or attribute of the data that your system observes.

Add features to a language model to provide hints about how to recognize input that you want to label or classify. Features help LUIS recognize both intents and entities, but features are not intents or entities themselves. Instead, a feature might provide examples of related terms.

What is a phrase list feature?

A phrase list is a list of words or phrases that are more significant to your app than other words in utterances. A phrase list adds to the vocabulary of the app domain as an additional signal to LUIS about those words. What LUIS learns about one of them is automatically applied to the others as well. This list is not a closed list entity of exact text matches.

Phrase lists do not help with stemming, so you need to add example utterances that use a variety of stemmed forms for any significant vocabulary words and phrases.

Phrase lists help all models

Phrase lists are not linked to a specific intent or entity but are added as a significant boost to all the intents and entities. Their purpose is to improve intent detection and entity classification.

How to use phrase lists

Create a phrase list when your app has words or phrases that are important to the app, such as:

  • industry terms
  • slang
  • abbreviations
  • company-specific language
  • words from another language that are frequently used in your app
  • key words and phrases in your example utterances

Once you've entered a few words or phrases, use the Recommend feature to find related values. Review the related values before adding them to your phrase list.

List type | Purpose
----------|--------
Interchangeable | Synonyms, or words that can be swapped for another word in the list without changing the intent or the entity extraction.
Non-interchangeable | App vocabulary that is more specific to your app than the general words of that language.

Interchangeable lists

An interchangeable phrase list is for values that are synonyms. For example, if you want all bodies of water found and you have example utterances such as:

  • What cities are close to the Great Lakes?
  • What road runs along Lake Havasu?
  • Where does the Nile start and end?

The intent and entities of each utterance should be determined in the same way, regardless of which body of water is named:

  • What cities are close to [bodyOfWater]?
  • What road runs along [bodyOfWater]?
  • Where does the [bodyOfWater] start and end?

Because the words or phrases for the body of water are synonymous and can be used interchangeably in the utterances, use the Interchangeable setting on the phrase list.
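To make "interchangeable" concrete, here is a minimal Python sketch that swaps each listed value for the [bodyOfWater] placeholder and shows that the three example utterances reduce to the templates above. It only illustrates what interchangeability means. It is not how LUIS processes utterances, since a phrase list is a hint rather than a literal substitution. The names BODIES_OF_WATER and to_template are hypothetical.

```python
import re

# Hypothetical values for an interchangeable phrase list (illustrative only).
BODIES_OF_WATER = ["the Great Lakes", "Lake Havasu", "Nile"]


def to_template(utterance, values, placeholder="[bodyOfWater]"):
    """Replace any listed value found in the utterance with the placeholder."""
    # Try longer values first so a long phrase wins over a shorter one it contains.
    for value in sorted(values, key=len, reverse=True):
        pattern = re.compile(re.escape(value), re.IGNORECASE)
        utterance = pattern.sub(placeholder, utterance)
    return utterance


examples = [
    "What cities are close to the Great Lakes?",
    "What road runs along Lake Havasu?",
    "Where does the Nile start and end?",
]
templates = [to_template(text, BODIES_OF_WATER) for text in examples]
```

If swapping one value for another leaves the template, and therefore the expected intent and entity, unchanged, the values are reasonable candidates for an interchangeable list.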

Non-interchangeable lists

A non-interchangeable phrase list is a signal to LUIS that boosts detection. The phrase list indicates words or phrases that are more significant than other words. This helps with both intent determination and entity detection. For example, say you have a subject domain like travel that is global (meaning across cultures but still in a single language). There are words and phrases that are important to the app but are not synonymous.

As another example, use a non-interchangeable phrase list for rare, proprietary, and foreign words. LUIS may be unable to recognize rare and proprietary words, as well as foreign words (outside of the culture of the app). The non-interchangeable setting indicates that the set of rare words forms a class that LUIS should learn to recognize, but that the words are not synonyms of, or interchangeable with, each other.

Do not add every possible word or phrase to a phrase list. Add a few words or phrases at a time, then retrain and publish.

As the phrase list grows over time, you may find that some terms have many forms (synonyms). Break these out into another phrase list that is interchangeable.
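The advice to grow a non-interchangeable list a few phrases at a time can be sketched as follows. This is a minimal sketch that only builds request bodies and makes no network call. The field names name, phrases (a comma-separated string), and isExchangeable are assumptions modeled on the phrase list object of the LUIS authoring API, so verify them against the current API reference. The helper names and the travel terms are hypothetical.

```python
def in_batches(phrases, size=3):
    """Yield successive groups of at most `size` phrases."""
    for start in range(0, len(phrases), size):
        yield phrases[start:start + size]


def build_phrase_list(name, phrases, interchangeable):
    """Build a phrase list body as a plain dict.

    Field names are an assumption modeled on the LUIS authoring API.
    """
    return {
        "name": name,
        "phrases": ",".join(phrases),
        "isExchangeable": interchangeable,
    }


# Hypothetical travel vocabulary: important to the app, but not synonyms.
TRAVEL_TERMS = ["layover", "red-eye", "carry-on", "open-jaw", "codeshare"]

payloads = []
added = []
for batch in in_batches(TRAVEL_TERMS, size=2):
    added.extend(batch)
    payloads.append(build_phrase_list("travelTerms", added, interchangeable=False))
    # In a real workflow you would update the list, retrain, publish, and
    # review predictions here before adding the next batch.
```

Each payload carries the full list so far, so the list grows in small steps and the effect of each step can be checked before the next one.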

Phrase lists help identify simple interchangeable entities

Interchangeable phrase lists are a good way to tune the performance of your LUIS app. If your app has trouble predicting the correct intent for utterances, or recognizing entities, think about whether the utterances contain unusual words or words that might be ambiguous in meaning. These words are good candidates to include in a phrase list.

Phrase lists help identify intents by better understanding context

A phrase list is not an instruction to LUIS to perform strict matching or to always label all terms in the phrase list in exactly the same way. It is simply a hint. For example, you could have a phrase list that indicates that "Patti" and "Selma" are names, but LUIS can still use contextual information to recognize that they mean something different in "Make a reservation for 2 at Patti's Diner for dinner" and "Find me driving directions to Selma, Georgia".

Adding a phrase list is an alternative to adding more example utterances to an intent.

When to use phrase lists versus list entities

While both phrase lists and list entities can impact utterances across all intents, each does so in a different way. Use a phrase list to affect the intent prediction score. Use a list entity to affect entity extraction for an exact text match.

Use a phrase list

With a phrase list, LUIS can still take context into account and generalize to identify items that are similar to, but not an exact match for, items in the list. If you need your LUIS app to be able to generalize and identify new items in a category, use a phrase list.

When you want to be able to recognize new instances of an entity, like a meeting scheduler that should recognize the names of new contacts, or an inventory app that should recognize new products, use another type of machine-learned entity such as a simple entity. Then create a phrase list of words and phrases that helps LUIS find other words similar to the entity. This list guides LUIS to recognize examples of the entity by giving those words additional significance.

Phrase lists are like domain-specific vocabulary that helps enhance the quality of understanding of both intents and entities. A common use of a phrase list is for proper nouns such as city names. A city name can span several words and can include hyphens or apostrophes.

Don't use a phrase list

A list entity explicitly defines every value an entity can take, and only identifies values that match exactly. A list entity may be appropriate for an app in which all instances of an entity are known and don't change often. Examples are food items on a restaurant menu that changes infrequently. If you need an exact text match of an entity, do not use a phrase list.
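The exact-match behavior that distinguishes a list entity from a phrase list can be illustrated with a toy lookup. This is only a sketch of the idea of a closed list, not LUIS's implementation. It ignores any synonyms a real list entity might define, and the menu values and the function name exact_matches are hypothetical.

```python
import re

# Hypothetical closed set of values, such as items on a menu that rarely changes.
MENU_ITEMS = ["cheeseburger", "garden salad", "iced tea"]


def exact_matches(utterance, values):
    """Return the listed values that occur in the utterance as whole words."""
    found = []
    for value in values:
        # \b keeps "cheeseburger" from matching inside "cheeseburgers".
        if re.search(r"\b" + re.escape(value) + r"\b", utterance, re.IGNORECASE):
            found.append(value)
    return found


order = exact_matches("I'd like a cheeseburger and an iced tea", MENU_ITEMS)
unlisted = exact_matches("Do you have veggie burgers?", MENU_ITEMS)
```

A value that is not on the list, or a variant form of a listed value, is simply not found. That is the behavior to choose when exact text matches are required, and the reason a phrase list is the better fit when the app must generalize to new items.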

Best practices

Learn best practices.

Next steps

See Add Features to learn more about how to add features to your LUIS app.