Tutorial: Batch test data sets

This tutorial demonstrates how to use batch testing to validate the quality of your Language Understanding (LUIS) app.

Batch testing lets you validate the state of the active, trained model against a known set of labeled utterances and entities. In the JSON-formatted batch file, add the utterances and set the entity labels you need predicted inside each utterance.

Requirements for batch testing:

  • Maximum of 1000 utterances per test.
  • No duplicates.
  • Entity types allowed: only machine-learned entities.

When using an app other than the one in this tutorial, do not use the example utterances already added to your app.
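The requirements above can be checked before uploading a file. The following is a minimal sketch, not part of LUIS itself: the function name `validate_batch` and the `existing_examples` parameter are hypothetical, and it assumes duplicates are detected by exact string comparison of the `text` field.

```python
MAX_UTTERANCES = 1000  # limit stated in the requirements above


def validate_batch(batch, existing_examples=()):
    """Return a list of human-readable problems found in a batch test list.

    `batch` is the parsed JSON array of the batch file.
    `existing_examples` is an optional iterable of utterance texts that are
    already example utterances in the app (hypothetical input).
    """
    problems = []
    if len(batch) > MAX_UTTERANCES:
        problems.append(
            f"{len(batch)} utterances exceeds the limit of {MAX_UTTERANCES}"
        )

    seen = set()
    known = set(existing_examples)
    for index, item in enumerate(batch):
        text = item["text"]
        # Assumption: two utterances are duplicates only if their text is identical.
        if text in seen:
            problems.append(f"item {index}: duplicate utterance {text!r}")
        seen.add(text)
        if text in known:
            problems.append(
                f"item {index}: {text!r} is already an example utterance in the app"
            )
    return problems
```

An empty list means no problems were found under these assumptions.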

In this tutorial, you learn how to:

  • Import the example app
  • Create a batch test file
  • Run a batch test
  • Review test results

For this article, you can use a free LUIS account to author your LUIS application.

Import example app

Import an app that takes a pizza order such as 1 pepperoni pizza on thin crust.

  1. Download and save the app JSON file.

  2. Sign in to the LUIS portal, and select your Subscription and Authoring resource to see the apps assigned to that authoring resource.

  3. Import the JSON into a new app and name the app Pizza app.

  4. Select Train in the top-right corner of the navigation to train the app.

What the batch file utterances should include

The batch file should include utterances with top-level machine-learning entities labeled, including start and end positions. The utterances should not be part of the examples already in the app. They should be utterances for which you want a positive prediction of intent and entities.

You can separate tests by intent, by entity, or both, or keep all the tests (up to 1000 utterances) in the same file.
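If you choose to keep one test file per intent, the split can be scripted. This is a minimal sketch under stated assumptions: the helper `split_by_intent` and the `<intent>-<part>.json` file names are hypothetical, and each item is assumed to carry an `intent` key as in the batch file format.

```python
from collections import defaultdict


def split_by_intent(batch, max_per_file=1000):
    """Group batch items by intent and chunk each group to the size limit.

    Returns a dict mapping a hypothetical file name to the list of items
    that would be written to that file.
    """
    grouped = defaultdict(list)
    for item in batch:
        grouped[item["intent"]].append(item)

    files = {}
    for intent, items in grouped.items():
        # Chunk so that no single file exceeds max_per_file utterances.
        for part, start in enumerate(range(0, len(items), max_per_file), start=1):
            files[f"{intent}-{part}.json"] = items[start:start + max_per_file]
    return files
```

Each value can then be written with `json.dump` and imported as its own dataset.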

Batch file

The example JSON includes one utterance with labeled entities to illustrate what a test file looks like. In your own tests, you should have many utterances, each labeled with the correct intent and machine-learning entities.

  1. Create pizza-with-machine-learned-entity-test.json in a text editor or download it.

  2. In the JSON-formatted batch file, add an utterance with the intent you want predicted in the test.

[
    {
        "text": "I want to pick up 1 cheese pizza",
        "intent": "ModifyOrder",
        "entities": [
            {
                "entity": "Order",
                "startPos": 18,
                "endPos": 31
            },
            {
                "entity": "ToppingList",
                "startPos": 20,
                "endPos": 25
            }
        ]
    }
]
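In the example above, the positions are zero-based character offsets and `endPos` is inclusive: "1 cheese pizza" spans offsets 18 through 31. The sketch below derives such labels from a phrase rather than counting by hand. The helper `label_entity` is hypothetical, and it labels only the first occurrence of the phrase.

```python
def label_entity(text, entity, phrase):
    """Build an entity label with zero-based, inclusive character positions."""
    start = text.find(phrase)  # first occurrence only
    if start == -1:
        raise ValueError(f"{phrase!r} does not occur in {text!r}")
    return {
        "entity": entity,
        "startPos": start,
        "endPos": start + len(phrase) - 1,  # inclusive end position
    }


utterance = "I want to pick up 1 cheese pizza"
print(label_entity(utterance, "Order", "1 cheese pizza"))
# {'entity': 'Order', 'startPos': 18, 'endPos': 31}
print(label_entity(utterance, "ToppingList", "cheese"))
# {'entity': 'ToppingList', 'startPos': 20, 'endPos': 25}
```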

Run the batch

  1. Select Test in the top navigation bar.

  2. Select Batch testing panel in the right-side panel.

  3. Select Import dataset.

    Screenshot of LUIS app with Import dataset highlighted

  4. Choose the file location of the pizza-with-machine-learned-entity-test.json file.

  5. Name the dataset pizza test and select Done.

    Select file

  6. Select the Run button.

  7. Select See results.

  8. Review results in the graph and legend.

Review batch results for intents

The test results show graphically how the test utterances were predicted against the active version.

The batch chart displays four quadrants of results. To the right of the chart is a filter that contains intents and entities. When you select a section of the chart or a point within the chart, the associated utterances display below the chart.

While you hover over the chart, the mouse wheel enlarges or reduces the display. This is useful when many points on the chart are clustered tightly together.

The chart is divided into four quadrants, two of which are displayed in red.

  1. Select the ModifyOrder intent in the filter list.

    Select ModifyOrder intent from filter list

    The utterance is predicted as a True Positive, meaning the utterance successfully matched its positive prediction listed in the batch file.

    Utterance successfully matched its positive prediction

    The green checkmarks in the filter list also indicate the success of the test for each intent. All the other intents are listed with a 1/1 positive score because the utterance was tested against each intent, and it serves as a negative test for any intent not listed in the batch test.

  2. Select the Confirmation intent. This intent isn't listed in the batch test, so this is a negative test of the utterance that is listed in the batch test.

    Utterance successfully predicted negative for unlisted intent in batch file

    The negative test was successful, as noted by the green text in the filter and the grid.
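The positive and negative tests described above can be expressed as a small classification rule. This is a simplified sketch, not how LUIS computes its chart: it assumes the prediction is just the top-scoring intent, and the quadrant names other than True Positive follow the usual confusion-matrix terms. Under that reading, the two red sections would presumably be the two false quadrants.

```python
def quadrant(expected_intent, predicted_intent, intent):
    """Classify one utterance against one intent in the filter list.

    expected_intent:  intent labeled in the batch file
    predicted_intent: intent the model predicted (assumed: top-scoring intent)
    intent:           the intent currently selected in the filter
    """
    expected = expected_intent == intent
    predicted = predicted_intent == intent
    if expected and predicted:
        return "True Positive"
    if not expected and not predicted:
        return "True Negative"
    if predicted:
        return "False Positive"
    return "False Negative"


# The tutorial's utterance is labeled ModifyOrder and predicted as ModifyOrder.
print(quadrant("ModifyOrder", "ModifyOrder", "ModifyOrder"))   # True Positive
print(quadrant("ModifyOrder", "ModifyOrder", "Confirmation"))  # True Negative
```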

Review batch test results for entities

The Order entity, as a machine-learning entity with subentities, displays whether the top-level entity matched and how the subentities are predicted.

  1. Select the Order entity in the filter list, then select the circle in the grid.

  2. The entity prediction displays below the chart. The display includes solid lines for predictions that match the expectation and dotted lines for predictions that don't match the expectation.

    Entity parent successfully predicted in batch file

Finding errors with a batch test

This tutorial showed you how to run a test and interpret the results. It didn't cover test philosophy or how to respond to failing tests.

  • Make sure to cover both positive and negative utterances in your test, including utterances that may be predicted for a different but related intent.
  • For failing utterances, perform the following tasks, then run the tests again:
    • Review the current examples for intents and entities, and validate that the example utterances of the active version are correct for both intent and entity labeling.
    • Add features that help your app predict intents and entities.
    • Add more positive example utterances.
    • Review the balance of example utterances across intents.
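For the last task, a quick count per intent makes an imbalance easy to spot. This is a minimal sketch: the function `intent_balance` is hypothetical, and it assumes the example utterances are available as a list of objects with an `intent` key, shaped like the items in the batch file.

```python
from collections import Counter


def intent_balance(examples):
    """Return {intent: (count, share_of_total)} for a list of labeled utterances."""
    counts = Counter(item["intent"] for item in examples)
    total = sum(counts.values())
    # An empty input yields an empty dict, so no division by zero occurs.
    return {intent: (n, n / total) for intent, n in counts.items()}
```

An intent with a much smaller share than the others is a candidate for more example utterances.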

Clean up resources

When no longer needed, delete the LUIS app. To do so, select My apps from the menu at the top left. Select the app name in the app list, then select Delete from the context menu. In the Delete app pop-up dialog, select OK.

Next step

This tutorial used a batch test to validate the current model.