Train a custom speech model
In this article, you learn how to train a custom model to improve recognition accuracy over the Microsoft base model. The speech recognition accuracy and quality of a custom speech model remain consistent, even when a new base model is released.
Note
You pay for custom speech model usage and endpoint hosting. You're also charged for custom speech model training if the base model was created on or after October 1, 2023. You aren't charged for training if the base model was created before October 2023. For more information, see Azure AI Speech pricing and the Charge for adaptation section in the speech to text 3.2 migration guide.
Training a model is typically an iterative process. You first select a base model that is the starting point for a new model. You train a model with datasets that can include text and audio, and then you test. If the recognition quality or accuracy doesn't meet your requirements, you can create a new model with more or modified training data, and then test again.
You can use a custom model for a limited time after it was trained. You must periodically recreate and adapt your custom model from the latest base model to take advantage of the improved accuracy and quality. For more information, see Model and endpoint lifecycle.
You can copy a model to another project that uses the same locale.
Follow these instructions to copy a model to a project in another region:
- Sign in to the Speech Studio.
- Select Custom speech > Your project name > Train custom models.
- Select Copy to.
- On the Copy speech model page, select a target region where you want to copy the model.
- Select a Speech resource in the target region, or create a new Speech resource.
- Select a project where you want to copy the model, or create a new project.
- Select Copy.
After the model is successfully copied, you'll be notified and can view it in the target project.
Copying a model directly to a project in another region isn't supported with the Speech CLI. You can copy a model to a project in another region using the Speech Studio or Speech to text REST API.
To copy a model to another Speech resource, use the Models_Copy operation of the Speech to text REST API. Construct the request body according to the following instructions:
- Set the required targetSubscriptionKey property to the key of the destination Speech resource.
Make an HTTP POST request using the URI as shown in the following example. Use the region and URI of the model you want to copy from. Replace YourModelId with the model ID, replace YourSubscriptionKey with your Speech resource key, replace YourServiceRegion with your Speech resource region, and set the request body properties as previously described.
curl -v -X POST -H "Ocp-Apim-Subscription-Key: YourSubscriptionKey" -H "Content-Type: application/json" -d '{
"targetSubscriptionKey": "ModelDestinationSpeechResourceKey"
} ' "https://YourServiceRegion.api.cognitive.azure.cn/speechtotext/v3.2/models/YourModelId:copy"
Note
Only the targetSubscriptionKey property in the request body has information about the destination Speech resource.
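For readers who script this step, the following is a minimal Python sketch of the same Models_Copy request, built with the standard library. The host pattern, API version, header names, and placeholder values come from the curl example above. The helper name build_copy_request is hypothetical, and the request is only constructed, not sent.

```python
import json
import urllib.request

# Host pattern and API version are taken from the curl example above.
BASE_URL = "https://{region}.api.cognitive.azure.cn/speechtotext/v3.2"


def build_copy_request(region, model_id, source_key, target_key):
    """Build (but don't send) the Models_Copy POST request.

    source_key authenticates against the resource that owns the model;
    target_key is the key of the destination Speech resource.
    """
    url = f"{BASE_URL.format(region=region)}/models/{model_id}:copy"
    body = json.dumps({"targetSubscriptionKey": target_key}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Ocp-Apim-Subscription-Key": source_key,
            "Content-Type": "application/json",
        },
    )


request = build_copy_request(
    region="YourServiceRegion",
    model_id="YourModelId",
    source_key="YourSubscriptionKey",
    target_key="ModelDestinationSpeechResourceKey",
)
print(request.get_method(), request.full_url)
```

To send the request, pass it to urllib.request.urlopen and decode the JSON response body.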
You should receive a response body in the following format:
{
"self": "https://chinanorth2.api.cognitive.azure.cn/speechtotext/v3.2/models/9df35ddb-edf9-4e91-8d1a-576d09aabdae",
"baseModel": {
"self": "https://chinanorth2.api.cognitive.azure.cn/speechtotext/v3.2/models/base/eb5450a7-3ca2-461a-b2d7-ddbb3ad96540"
},
"links": {
"manifest": "https://chinanorth2.api.cognitive.azure.cn/speechtotext/v3.2/models/9df35ddb-edf9-4e91-8d1a-576d09aabdae/manifest",
"copy": "https://chinanorth2.api.cognitive.azure.cn/speechtotext/v3.2/models/9df35ddb-edf9-4e91-8d1a-576d09aabdae:copy"
},
"properties": {
"deprecationDates": {
"adaptationDateTime": "2023-01-15T00:00:00Z",
"transcriptionDateTime": "2024-07-15T00:00:00Z"
}
},
"lastActionDateTime": "2022-05-22T23:15:27Z",
"status": "NotStarted",
"createdDateTime": "2022-05-22T23:15:27Z",
"locale": "en-US",
"displayName": "My Model",
"description": "My Model Description",
"customProperties": {
"PortalAPIVersion": "3",
"Purpose": "",
"VadKind": "None",
"ModelClass": "None",
"UsesHalide": "False",
"IsDynamicGrammarSupported": "False"
}
}
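The response carries the values you need for later steps. As a rough sketch, the following Python snippet pulls the new model ID out of the self property and reads the status and deprecationDates fields. The field names come from the sample response above (abbreviated here). The function name summarize_model and the shape of its return value are assumptions made for illustration.

```python
import json
from datetime import datetime, timezone

# Abbreviated copy of the sample response body shown above.
response_text = """
{
  "self": "https://chinanorth2.api.cognitive.azure.cn/speechtotext/v3.2/models/9df35ddb-edf9-4e91-8d1a-576d09aabdae",
  "properties": {
    "deprecationDates": {
      "adaptationDateTime": "2023-01-15T00:00:00Z",
      "transcriptionDateTime": "2024-07-15T00:00:00Z"
    }
  },
  "status": "NotStarted",
  "locale": "en-US"
}
"""


def parse_utc(value):
    """Parse timestamps of the form 2023-01-15T00:00:00Z as UTC datetimes."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def summarize_model(response):
    """Extract the model ID, status, and deprecation dates from a model entity."""
    # The model ID is the last path segment of the "self" URI.
    model_id = response["self"].rstrip("/").rsplit("/", 1)[-1]
    dates = response.get("properties", {}).get("deprecationDates", {})
    return {
        "model_id": model_id,
        "status": response.get("status"),
        "deprecation": {name: parse_utc(value) for name, value in dates.items()},
    }


summary = summarize_model(json.loads(response_text))
print(summary["model_id"], summary["status"])
```

Keeping the full self URI as well as the ID is convenient, because the connect step addresses the model by that URI.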
Connect a model
Models might have been copied from one project using the Speech CLI or REST API, without being connected to another project. Connecting a model is a matter of updating the model with a reference to the project.
If you're prompted in Speech Studio, you can connect the models by selecting the Connect button.
To connect a model to a project, use the spx csr model update command. Construct the request parameters according to the following instructions:
- Set the project parameter to the URI of an existing project. This parameter is recommended so that you can also view and manage the model in Speech Studio. You can run the spx csr project list command to get available projects.
- Set the required modelId parameter to the ID of the model that you want to connect to the project.
Here's an example Speech CLI command that connects a model to a project:
spx csr model update --api-version v3.2 --model YourModelId --project YourProjectId
You should receive a response body in the following format:
{
"project": {
"self": "https://chinanorth2.api.cognitive.azure.cn/speechtotext/v3.2/projects/0198f569-cc11-4099-a0e8-9d55bc3d0c52"
}
}
For Speech CLI help with models, run the following command:
spx help csr model
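If you drive the Speech CLI from a script, the connect command shown earlier can be assembled as an argument list. This is a minimal sketch: the subcommand and flags are copied from the example command in this article, the helper name build_connect_command is hypothetical, and running the result assumes that spx is installed and on your PATH.

```python
import shlex


def build_connect_command(model_id, project_id, api_version="v3.2"):
    """Return the argument list for the model update command shown in this article."""
    return [
        "spx", "csr", "model", "update",
        "--api-version", api_version,
        "--model", model_id,
        "--project", project_id,
    ]


command = build_connect_command("YourModelId", "YourProjectId")
# Print the shell-quoted form for review before running it.
print(shlex.join(command))
```

Passing the list to subprocess.run(command, check=True) avoids shell quoting issues with IDs or URIs.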
To connect a new model to a project of the Speech resource where the model was copied, use the Models_Update operation of the Speech to text REST API. Construct the request body according to the following instructions:
- Set the required project property to the URI of an existing project. This property is recommended so that you can also view and manage the model in Speech Studio. You can make a Projects_List request to get available projects.
Make an HTTP PATCH request using the URI as shown in the following example. Use the URI of the new model. You can get the new model ID from the self property of the Models_Copy response body. Replace YourModelId with the new model ID, replace YourSubscriptionKey with your Speech resource key, replace YourServiceRegion with your Speech resource region, and set the request body properties as previously described.
curl -v -X PATCH -H "Ocp-Apim-Subscription-Key: YourSubscriptionKey" -H "Content-Type: application/json" -d '{
"project": {
"self": "https://chinanorth2.api.cognitive.azure.cn/speechtotext/v3.2/projects/0198f569-cc11-4099-a0e8-9d55bc3d0c52"
}
}' "https://YourServiceRegion.api.cognitive.azure.cn/speechtotext/v3.2/models/YourModelId"
You should receive a response body in the following format:
{
"project": {
"self": "https://chinanorth2.api.cognitive.azure.cn/speechtotext/v3.2/projects/0198f569-cc11-4099-a0e8-9d55bc3d0c52"
}
}
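The Models_Update call can also be sketched in Python. The snippet below builds the PATCH request against the new model's URI, which is the self value from the Models_Copy response, with a body that references the target project. The URIs are the sample values from this article, the helper name build_connect_request is hypothetical, and the request is constructed without being sent.

```python
import json
import urllib.request


def build_connect_request(model_url, project_url, subscription_key):
    """Build (but don't send) a PATCH request that links a model to a project."""
    body = json.dumps({"project": {"self": project_url}}).encode("utf-8")
    return urllib.request.Request(
        model_url,
        data=body,
        method="PATCH",
        headers={
            "Ocp-Apim-Subscription-Key": subscription_key,
            "Content-Type": "application/json",
        },
    )


# Sample URIs from this article; substitute the values for your own resource.
MODEL_URL = "https://chinanorth2.api.cognitive.azure.cn/speechtotext/v3.2/models/9df35ddb-edf9-4e91-8d1a-576d09aabdae"
PROJECT_URL = "https://chinanorth2.api.cognitive.azure.cn/speechtotext/v3.2/projects/0198f569-cc11-4099-a0e8-9d55bc3d0c52"

connect_request = build_connect_request(MODEL_URL, PROJECT_URL, "YourSubscriptionKey")
print(connect_request.get_method(), connect_request.full_url)
```

Because json.dumps serializes the body, the payload is always valid JSON, which avoids hand-editing mistakes such as trailing commas.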