Plan your LUIS app schema with subject domain and data extraction

A LUIS app schema contains intents and entities relevant to your subject domain. The intents classify user utterances, and the entities extract data from those utterances.

Identify your domain

A LUIS app is centered around a subject domain. For example, you may have a travel app that handles booking of tickets, flights, hotels, and rental cars. Another app may provide content related to exercising, tracking fitness efforts, and setting goals. Identifying the domain helps you find words or phrases that are relevant to it.

Tip

LUIS offers prebuilt domains for many common scenarios. Check to see if you can use a prebuilt domain as a starting point for your app.

Identify your intents

Think about the intents that are important to your application's task.

Let's take the example of a travel app, with functions to book a flight and check the weather at the user's destination. You can define the BookFlight and GetWeather intents for these actions.
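As a rough illustration of how a client application might act on these two intents, the sketch below routes an utterance by its predicted intent name. Only the intent names BookFlight, GetWeather, and None come from the text; the handler functions and the `dispatch` helper are hypothetical and stand in for whatever your app actually does.

```python
from typing import Callable, Dict


def book_flight(utterance: str) -> str:
    # Hypothetical handler: a real app would call a booking service here.
    return f"booking flight for: {utterance}"


def get_weather(utterance: str) -> str:
    # Hypothetical handler: a real app would call a weather service here.
    return f"checking weather for: {utterance}"


def fallback(utterance: str) -> str:
    # Used for the None intent or any intent the client does not recognize.
    return "sorry, I did not understand that"


# One handler per intent defined in the app.
HANDLERS: Dict[str, Callable[[str], str]] = {
    "BookFlight": book_flight,
    "GetWeather": get_weather,
}


def dispatch(top_intent: str, utterance: str) -> str:
    """Route an utterance to the handler for its predicted intent."""
    return HANDLERS.get(top_intent, fallback)(utterance)
```

Keeping one handler per intent makes it easy to see when two intents would end up calling nearly the same code, which is a hint that they may be too similar to separate.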

In a more complex app with more functions, you have more intents, and you should define them carefully so the intents aren't too specific. For example, BookFlight and BookHotel may need to be separate intents, but BookInternationalFlight and BookDomesticFlight may be too similar.

Note

It is a best practice to use only as many intents as you need to perform the functions of your app. If you define too many intents, it becomes harder for LUIS to classify utterances correctly. If you define too few, they may be so general that they overlap.

If you don't need to identify overall user intention, add all the example user utterances to the None intent. If your app grows into needing more intents, you can create them later.

Create example utterances for each intent

To begin with, avoid creating too many utterances for each intent. Once you have determined the intents, create 15 to 30 example utterances per intent. Each utterance should be different from the previously provided utterances. Good variety in utterances includes differences in overall word count, word choice, verb tense, and punctuation.
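The guidance above can be checked mechanically before you upload utterances. The following is a minimal sketch, not a LUIS feature: the function name `review_utterances` and the metrics it reports are assumptions chosen for illustration, and only the 15 to 30 range comes from the text.

```python
import string
from typing import Dict, List

MIN_UTTERANCES = 15  # lower bound suggested in the text
MAX_UTTERANCES = 30  # upper bound suggested in the text


def review_utterances(utterances: List[str]) -> Dict[str, object]:
    """Summarize the count and surface variety of one intent's utterances."""
    # Normalize case and outer whitespace so trivial repeats count as duplicates.
    normalized = [u.strip().lower() for u in utterances]
    word_counts = {len(u.split()) for u in normalized}
    return {
        "count": len(utterances),
        "count_in_range": MIN_UTTERANCES <= len(utterances) <= MAX_UTTERANCES,
        "duplicates": len(normalized) - len(set(normalized)),
        "distinct_word_counts": len(word_counts),
        "uses_punctuation": any(
            ch in string.punctuation for u in utterances for ch in u
        ),
    }
```

A report with duplicates above zero or a single distinct word count suggests the examples are too uniform. Word choice and verb tense still need a human review, since this sketch only looks at surface features.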

For more information, see understanding good utterances for LUIS apps.

Identify your entities

In the example utterances, identify the entities you want extracted. To book a flight, you need information like the destination, date, airline, ticket category, and travel class. Create entities for these data types and then mark the entities in the example utterances. Entities are important for accomplishing an intent.
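Marking an entity amounts to recording which span of the utterance holds the value. The sketch below illustrates that idea with a hypothetical label format (entity name plus start and end character offsets); it is not the LUIS authoring format, and the entity names TravelClass, Destination, and Date are illustrative names for the data types listed above.

```python
from typing import Dict, List


def label_entities(utterance: str, values: Dict[str, str]) -> List[dict]:
    """Return character spans for each entity value found in the utterance.

    `values` maps an entity name to the exact text that should be labeled.
    Spans use Python slice semantics: start inclusive, end exclusive.
    """
    labels = []
    for entity, text in values.items():
        start = utterance.find(text)  # first occurrence only
        if start == -1:
            raise ValueError(f"{text!r} not found in utterance")
        labels.append({"entity": entity, "start": start, "end": start + len(text)})
    return labels


utterance = "Book a business class flight to Seattle on Friday"
labels = label_entities(
    utterance,
    {"TravelClass": "business class", "Destination": "Seattle", "Date": "Friday"},
)
```

Each label can be verified by slicing the utterance with its offsets, for example `utterance[label["start"]:label["end"]]`, which should return the labeled text.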

When determining which entities to use in your app, keep in mind that there are different types of entities for capturing relationships between object types. The article Entities in LUIS provides more detail about the different types.

Tip

LUIS offers prebuilt entities for common, conversational user scenarios. Consider using prebuilt entities as a starting point for your application development.

Resolution with intent or entity?

In many cases, especially when working with natural conversation, users provide an utterance that can contain more than one function or intent. To address this, a general rule of thumb is that the output can be represented in both intents and entities. This representation should map to your client application's actions, and it doesn't need to be limited to intents.

Int-ent-ties is the concept that actions (usually understood as intents) can also be captured as entities and relied on in this form in the output JSON, where you can map them to specific actions. Negation is a common case that relies on both intent and entity for full extraction.

Consider the following two utterances, which are very close in word choice but have different results:

Utterance
Please schedule my flight from Cairo to Seattle
Cancel my flight from Cairo to Seattle

Instead of having two separate intents, create a single intent with a FlightAction machine learning entity. The machine learning entity should extract the details of the action for both a scheduling and a cancelling request, as well as an origin or destination location.

The FlightAction entity would be structured in the following pseudo-schema of machine learning entity and subentities:

  • FlightAction
    • Action
    • Origin
    • Destination

To help the extraction, add features to the subentities. Choose your features based on the vocabulary you expect to see in user utterances and the values you want returned in the prediction response.
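To make the single-intent approach concrete, here is a minimal sketch of how a client might turn an extracted FlightAction entity into an application action. The payload shape is an assumed, simplified example and may differ from what the prediction endpoint actually returns. The intent name ManageFlight, the `ACTION_KEYWORDS` table, and the function `resolve_flight_action` are hypothetical.

```python
from typing import Optional

# Assumed, simplified prediction payload for illustration only.
sample_prediction = {
    "query": "Cancel my flight from Cairo to Seattle",
    "prediction": {
        "topIntent": "ManageFlight",  # hypothetical single intent
        "entities": {
            "FlightAction": [
                {"Action": ["Cancel"], "Origin": ["Cairo"], "Destination": ["Seattle"]}
            ]
        },
    },
}

# Hypothetical mapping from extracted action words to client operations.
ACTION_KEYWORDS = {
    "cancel": "cancel_flight",
    "schedule": "schedule_flight",
    "book": "schedule_flight",
}


def first(values: Optional[list]) -> Optional[str]:
    """Return the first extracted value, or None if nothing was extracted."""
    return values[0] if values else None


def resolve_flight_action(response: dict) -> dict:
    """Map the FlightAction entity and its subentities to a client operation."""
    entities = response.get("prediction", {}).get("entities", {})
    matches = entities.get("FlightAction") or [{}]
    flight = matches[0]
    verb = (first(flight.get("Action")) or "").strip().lower()
    return {
        "operation": ACTION_KEYWORDS.get(verb, "unknown"),
        "origin": first(flight.get("Origin")),
        "destination": first(flight.get("Destination")),
    }
```

With this layout the two example utterances share one intent, and the Action subentity alone decides whether the client schedules or cancels the flight.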

Next steps