View conversational language understanding model details
After model training is completed, you can view your model details and see how well it performs against the test set.
Note
Using the Automatically split the testing set from training data option may result in a different model evaluation result each time you train a new model, because the test set is selected randomly from your utterances. To make sure that the evaluation is calculated on the same test set every time you train a model, use the Use a manual split of training and testing data option when starting a training job, and define your Testing set when adding your utterances.
Select Model performance from the menu on the left side of the screen.
On this page you can view only the successfully trained models, the F1 score of each model, and the model expiration date. You can select the model name for more details about its performance. Models include evaluation details only if test data was selected while training the model.
On this tab you can view the model's details, such as its F1 score, precision, recall, the date and time of the training job, the total training time, and the number of training and testing utterances included in the training job. You can switch between details for intents and entities by selecting Model Type at the top.
You will also see guidance on how to improve the model. Selecting View details opens a side panel with more guidance on how to improve the model.
This is a snapshot of how your model performed during testing. The metrics here are static and tied to your model, so they won’t update until you train again.
For each intent or entity, you can see the precision, recall, F1 score, and the number of training and testing labels. Entities that do not include a learned component will show no training labels. A learned component is added only by adding labels in your training set.
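The per-intent and per-entity scores shown here follow the standard definitions of precision, recall, and F1. The following is a minimal sketch of those formulas with hypothetical counts, for illustration only:

```python
# Minimal sketch of the standard precision/recall/F1 definitions used for the
# per-intent and per-entity scores shown on this page.
def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    precision = tp / (tp + fp) if (tp + fp) else 0.0  # correct predictions / all predictions
    recall = tp / (tp + fn) if (tp + fn) else 0.0     # correct predictions / all labels
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return precision, recall, f1

# Hypothetical counts for one intent: 8 true positives, 2 false positives, 1 false negative.
print(precision_recall_f1(8, 2, 1))  # (0.8, 0.888..., 0.842...)
```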
Here you will see the utterances included in the test set and their intent or entity predictions. You can use the Show errors only toggle to show only the utterances whose predictions differ from their labels, or clear the toggle to view all utterances in the test set. You can also switch the view for each utterance between Showing entity labels and Showing entity predictions. Entity predictions show as dotted lines and labels show as solid lines.
You can expand each row to view its intent or entity predictions, specified by the Model Type column. The Text column shows the text of the entity that was predicted or labeled. Each row has a Labeled as column to indicate the label in the test set, and a Predicted as column to indicate the actual prediction. The Result Type column shows whether the prediction is a true positive, false positive, or false negative.
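For intents, the result types can be read as follows: a correct prediction counts as a true positive for that intent, while a wrong prediction counts as a false positive for the predicted intent and a false negative for the labeled intent. A minimal sketch with hypothetical rows:

```python
from collections import Counter

# Hypothetical test rows: the intent each utterance was labeled with vs. the
# intent the model predicted.
rows = [
    {"labeled_as": "Read", "predicted_as": "Read"},
    {"labeled_as": "Delete", "predicted_as": "Read"},
    {"labeled_as": "Send", "predicted_as": "Send"},
]

counts = Counter()
for row in rows:
    if row["labeled_as"] == row["predicted_as"]:
        counts[("true positive", row["labeled_as"])] += 1
    else:
        counts[("false positive", row["predicted_as"])] += 1
        counts[("false negative", row["labeled_as"])] += 1

print(counts)
```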
This snapshot shows how intents or entities are distributed across your training and testing sets. This data is static and tied to your model, so it won’t update until you train again. Entities that do not include a learned component will show no training labels. A learned component is added only by adding labels in your training set.
A confusion matrix is an N x N matrix used for evaluating the performance of your model, where N is the number of target intents or entities. The matrix compares the expected labels with those predicted by the model to identify which intents or entities are being misclassified as other intents and entities. You can click into any cell of the confusion matrix to identify exactly which utterances contributed to the values in that cell.
You can view the intent confusion matrix in raw count or normalized view. Raw count is the actual number of utterances that have been predicted and labeled for a set of intents. Normalized value is the ratio, between 0 and 1, of the predicted and labeled utterances for a set of intents.
You can view the entity confusion matrix in character overlap count or normalized character overlap view. Character overlap count is the actual number of spans that have been predicted and labeled for a set of entities. Normalized character overlap is the ratio, between 0 and 1, of the predicted and labeled spans for a set of entities. Sometimes entities can be predicted or labeled partially, leading to decimal values in the confusion matrix.
All values: Will show the confusion matrix for all intents or entities.
Only errors: Will show the confusion matrix for intents or entities with errors only.
Only matches: Will show the confusion matrix for intents or entities with correct predictions only.
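As a rough sketch of the raw count and normalized views described above, assuming row-wise normalization by the number of labeled utterances per intent and using hypothetical counts:

```python
import numpy as np

# Hypothetical raw count confusion matrix for two intents:
# rows are labeled intents, columns are predicted intents.
raw = np.array([
    [8, 2],  # labeled "Read":   8 predicted "Read", 2 predicted "Delete"
    [1, 9],  # labeled "Delete": 1 predicted "Read", 9 predicted "Delete"
])

# Normalized view: ratios between 0 and 1, here computed per labeled intent.
normalized = raw / raw.sum(axis=1, keepdims=True)
print(normalized)
# [[0.8 0.2]
#  [0.1 0.9]]
```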
Create a GET request using the following URL and headers to get the trained model evaluation summary.
Model Summary
This API returns the summary of your model's evaluation results, including the precision, recall, F1, and confusion matrix of your intents and entities.
Request URL
Use the following URL when creating your API request. Replace the placeholder values below with your own values.

| Placeholder | Value | Example |
|-------------|-------|---------|
| {PROJECT-NAME} | The name for your project. This value is case-sensitive. | EmailApp |
| {API-VERSION} | The version of the API you are calling. | 2022-10-01-preview |
| {MODEL-NAME} | The name of your model. This value is case-sensitive. | v1 |
Headers
Use the following header to authenticate your request.
| Key | Value |
|-----|-------|
| Ocp-Apim-Subscription-Key | The key to your resource. Used for authenticating your API requests. |
Once you send your API request, you will receive a 202 response indicating success. In the response headers, extract the operation-location value. It will be formatted like this:
The JOB-ID is used to identify your request, since this operation is asynchronous. Use this URL to get the status of the operation, using the same authentication method.
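As a minimal sketch of this request in Python with the `requests` library (the route below is an assumption based on the Language authoring API layout; confirm the exact path for your API version in the REST reference):

```python
import requests

# Placeholder values from the tables above (examples only).
endpoint = "https://<your-resource>.cognitiveservices.azure.com"  # your resource endpoint
project_name = "EmailApp"
model_name = "v1"
api_version = "2022-10-01-preview"
key = "<your-resource-key>"

# Assumed route for the trained model evaluation summary; check the REST
# reference for the exact path in your API version.
url = (
    f"{endpoint}/language/authoring/analyze-conversations/projects/"
    f"{project_name}/models/{model_name}/evaluation/summary-result"
)

response = requests.get(
    url,
    params={"api-version": api_version},
    headers={"Ocp-Apim-Subscription-Key": key},
)
print(response.status_code)

# For asynchronous operations, the job URL is returned in the
# operation-location response header; otherwise the body holds the summary.
print(response.headers.get("operation-location"))
print(response.json())
```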
Export model data
Create a POST request using the following URL, headers, and JSON body to export your model data.
Request URL
Use the following URL when creating your API request. Replace the placeholder values below with your own values.
| Placeholder | Value | Example |
|-------------|-------|---------|
| {PROJECT-NAME} | The name for your project. This value is case-sensitive. | EmailApp |
| {API-VERSION} | The version of the API you are calling. | 2022-10-01-preview |
| {MODEL-NAME} | The name of your model. This value is case-sensitive. | v1 |
Headers
Use the following header to authenticate your request.
| Key | Value |
|-----|-------|
| Ocp-Apim-Subscription-Key | The key to your resource. Used for authenticating your API requests. |
Once you send your API request, you will receive a 202 response indicating success. In the response headers, extract the operation-location value. It will be formatted like this:
JOB-ID is used to identify your request, since this operation is asynchronous. Use this URL to get the exported project JSON, using the same authentication method.
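As a minimal sketch of this flow in Python with the `requests` library (the export route below is an assumption; confirm the exact path and any required query parameters in the Language authoring REST reference):

```python
import requests
import time

# Placeholder values from the tables above (examples only).
endpoint = "https://<your-resource>.cognitiveservices.azure.com"  # your resource endpoint
project_name = "EmailApp"
api_version = "2022-10-01-preview"
key = "<your-resource-key>"

# Assumed export route; check the REST reference for the exact path.
url = (
    f"{endpoint}/language/authoring/analyze-conversations/projects/"
    f"{project_name}/:export"
)
headers = {"Ocp-Apim-Subscription-Key": key}

# Send the export request; a 202 response means the job was accepted.
response = requests.post(url, params={"api-version": api_version}, headers=headers)
print(response.status_code)

# The job URL (containing the JOB-ID) is returned in the operation-location header.
job_url = response.headers["operation-location"]

# Poll the job URL with the same authentication until the exported project
# JSON is available.
time.sleep(2)
result = requests.get(job_url, headers=headers).json()
print(result)
```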