Note
Access to this page requires authorization. You can try signing in or changing directories.
Access to this page requires authorization. You can try changing directories.
In this article
Important
This article provides information on using the Azure Machine Learning SDK v1. The SDK v1 is deprecated as of March 31, 2025 and support for it will end on June 30, 2026. You're able to install and use the SDK v1 until that date.
We recommend that you transition to the SDK v2 before June 30, 2026. For more information on the SDK v2, see What is the Azure Machine Learning Python SDK v2 and the SDK v2 reference.
Deploying an Azure Machine Learning model as a web service creates a REST API endpoint. You can send data to this endpoint and receive the prediction returned by the model. In this document, learn how to create clients for the web service by using C#, Go, Java, and Python.
You create a web service when you deploy a model to your local environment, Azure Container Instances, or Azure Kubernetes Service. You retrieve the URI used to access the web service by using the Azure Machine Learning SDK. If authentication is enabled, you can also use the SDK to get the authentication keys or tokens.
The general workflow for creating a client that uses a machine learning web service is:
- Use the SDK to get the connection information.
- Determine the type of request data used by the model.
- Create an application that calls the web service.
Tip
The examples in this document are manually created without the use of OpenAPI (Swagger) specifications. If you've enabled an OpenAPI specification for your deployment, you can use tools such as swagger-codegen to create client libraries for your service.
Important
Some of the Azure CLI commands in this article use the azure-cli-ml
, or v1, extension for Azure Machine Learning. Support for the v1 extension will end on September 30, 2025. You're able to install and use the v1 extension until that date.
We recommend that you transition to the ml
, or v2, extension before September 30, 2025. For more information on the v2 extension, see Azure Machine Learning CLI extension and Python SDK v2.
Note
Use the Azure Machine Learning SDK to get the web service information. This is a Python SDK. You can use any language to create a client for the service.
The azureml.core.Webservice class provides the information you need to create a client. The following Webservice
properties are useful for creating a client application:
auth_enabled
- If key authentication is enabled,True
; otherwise,False
.token_auth_enabled
- If token authentication is enabled,True
; otherwise,False
.scoring_uri
- The REST API address.swagger_uri
- The address of the OpenAPI specification. This URI is available if you enabled automatic schema generation. For more information, see Deploy models with Azure Machine Learning.
There are several ways to retrieve this information for deployed web services:
APPLIES TO:
Python SDK azureml v1
When you deploy a model, a
Webservice
object is returned with information about the service:service = Model.deploy(ws, "myservice", [model], inference_config, deployment_config) service.wait_for_deployment(show_output = True) print(service.scoring_uri) print(service.swagger_uri)
You can use
Webservice.list
to retrieve a list of deployed web services for models in your workspace. You can add filters to narrow the list of information returned. For more information about what can be filtered on, see the Webservice.list reference documentation.services = Webservice.list(ws) print(services[0].scoring_uri) print(services[0].swagger_uri)
If you know the name of the deployed service, you can create a new instance of
Webservice
, and provide the workspace and service name as parameters. The new object contains information about the deployed service.service = Webservice(workspace=ws, name='myservice') print(service.scoring_uri) print(service.swagger_uri)
The following table shows what these URIs look like:
URI type | Example |
---|---|
Scoring URI | http://104.214.29.152:80/api/v1/service/<service-name>/score |
Swagger URI | http://104.214.29.152/api/v1/service/<service-name>/swagger.json |
Tip
The IP address is different for your deployment. Each AKS cluster has its own IP address that is shared by deployments to that cluster.
If you secured the deployed web service using a TLS/SSL certificate, you can use HTTPS to connect to the service using the scoring or swagger URI. HTTPS helps secure communications between a client and a web service by encrypting communications between the two. Encryption uses Transport Layer Security (TLS). TLS is sometimes still referred to as Secure Sockets Layer (SSL), which was the predecessor of TLS.
Important
Web services deployed by Azure Machine Learning only support TLS version 1.2. When creating a client application, make sure that it supports this version.
For more information, see Use TLS to secure a web service through Azure Machine Learning.
Azure Machine Learning provides two ways to control access to your web services.
Authentication Method | ACI | AKS |
---|---|---|
Key | Disabled by default | Enabled by default |
Token | Not Available | Disabled by default |
When sending a request to a service that is secured with a key or token, use the Authorization header to pass the key or token. The key or token must be formatted as Bearer <key-or-token>
, where <key-or-token>
is your key or token value.
The primary difference between keys and tokens is that keys are static and can be regenerated manually, and tokens need to be refreshed upon expiration. Key-based auth is supported for Azure Container Instance and Azure Kubernetes Service deployed web-services, and token-based auth is only available for Azure Kubernetes Service deployments. For more information on configuring authentication, see Configure authentication for models deployed as web services.
When you enable authentication for a deployment, you automatically create authentication keys.
- Authentication is enabled by default when you're deploying to Azure Kubernetes Service.
- Authentication is disabled by default when you're deploying to Azure Container Instances.
To control authentication, use the auth_enabled
parameter when you're creating or updating a deployment.
If authentication is enabled, you can use the get_keys
method to retrieve a primary and secondary authentication key:
primary, secondary = service.get_keys()
print(primary)
Important
If you need to regenerate a key, use service.regen_key
.
When you enable token authentication for a web service, a user must provide an Azure Machine Learning JWT to the web service to access it.
- Token authentication is disabled by default when you're deploying to Azure Kubernetes Service.
- Token authentication isn't supported when you're deploying to Azure Container Instances.
To control token authentication, use the token_auth_enabled
parameter when you're creating or updating a deployment.
If token authentication is enabled, you can use the get_token
method to retrieve a bearer token and that tokens expiration time:
token, refresh_by = service.get_token()
print(token)
If you have the Azure CLI and the machine learning extension, you can use the following command to get a token:
APPLIES TO:
Azure CLI ml extension v1
az ml service get-access-token -n <service-name>
Important
Currently the only way to retrieve the token is by using the Azure Machine Learning SDK or the Azure CLI machine learning extension.
You'll need to request a new token after the token's refresh_by
time.
The REST API expects the body of the request to be a JSON document with the following structure:
{
"data":
[
<model-specific-data-structure>
]
}
Important
The structure of the data needs to match what the scoring script and model in the service expect. The scoring script might modify the data before passing it to the model.
For information on how to enable support for binary data in your service, see Binary data.
Tip
Enabling support for binary data happens in the score.py file used by the deployed model. From the client, use the HTTP functionality of your programming language. For example, the following snippet sends the contents of a JPG file to a web service:
import requests
# Load image data
data = open('example.jpg', 'rb').read()
# Post raw data to scoring URI
res = request.post(url='<scoring-uri>', data=data, headers={'Content-Type': 'application/> octet-stream'})
For information on enabling CORS support in your service, see Cross-origin resource sharing.
This example demonstrates how to use C# to call a web service:
using System;
using System.Collections.Generic;
using System.IO;
using System.Net.Http;
using System.Net.Http.Headers;
using Newtonsoft.Json;
namespace MLWebServiceClient
{
// The data structure expected by the service
internal class InputData
{
[JsonProperty("data")]
// The service used by this example expects an array containing
// one or more arrays of doubles
internal double[,] data;
}
class Program
{
static void Main(string[] args)
{
// Set the scoring URI and authentication key or token
string scoringUri = "<your web service URI>";
string authKey = "<your key or token>";
// Set the data to be sent to the service.
// In this case, we are sending two sets of data to be scored.
InputData payload = new InputData();
payload.data = new double[,] {
{
0.0199132141783263,
0.0506801187398187,
0.104808689473925,
0.0700725447072635,
-0.0359677812752396,
-0.0266789028311707,
-0.0249926566315915,
-0.00259226199818282,
0.00371173823343597,
0.0403433716478807
},
{
-0.0127796318808497,
-0.044641636506989,
0.0606183944448076,
0.0528581912385822,
0.0479653430750293,
0.0293746718291555,
-0.0176293810234174,
0.0343088588777263,
0.0702112981933102,
0.00720651632920303
}
};
// Create the HTTP client
HttpClient client = new HttpClient();
// Set the auth header. Only needed if the web service requires authentication.
client.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue("Bearer", authKey);
// Make the request
try {
var request = new HttpRequestMessage(HttpMethod.Post, new Uri(scoringUri));
request.Content = new StringContent(JsonConvert.SerializeObject(payload));
request.Content.Headers.ContentType = new MediaTypeHeaderValue("application/json");
var response = client.SendAsync(request).Result;
// Display the response from the web service
Console.WriteLine(response.Content.ReadAsStringAsync().Result);
}
catch (Exception e)
{
Console.Out.WriteLine(e.Message);
}
}
}
}
The results returned are similar to the following JSON document:
[217.67978776218715, 224.78937091757172]
This example demonstrates how to use Python to call a web service:
import requests
import json
# URL for the web service
scoring_uri = '<your web service URI>'
# If the service is authenticated, set the key or token
key = '<your key or token>'
# Two sets of data to score, so we get two results back
data = {"data":
[
[
0.0199132141783263,
0.0506801187398187,
0.104808689473925,
0.0700725447072635,
-0.0359677812752396,
-0.0266789028311707,
-0.0249926566315915,
-0.00259226199818282,
0.00371173823343597,
0.0403433716478807
],
[
-0.0127796318808497,
-0.044641636506989,
0.0606183944448076,
0.0528581912385822,
0.0479653430750293,
0.0293746718291555,
-0.0176293810234174,
0.0343088588777263,
0.0702112981933102,
0.00720651632920303]
]
}
# Convert to JSON string
input_data = json.dumps(data)
# Set the content type
headers = {'Content-Type': 'application/json'}
# If authentication is enabled, set the authorization header
headers['Authorization'] = f'Bearer {key}'
# Make the request and display the response
resp = requests.post(scoring_uri, input_data, headers=headers)
print(resp.text)
The results returned are similar to the following JSON document:
[217.67978776218715, 224.78937091757172]
If you used automatic schema generation with your deployment, you can get the address of the OpenAPI specification for the service by using the swagger_uri property. (For example, print(service.swagger_uri)
.) Use a GET request or open the URI in a browser to retrieve the specification.
The following JSON document is an example of a schema (OpenAPI specification) generated for a deployment:
{
"swagger": "2.0",
"info": {
"title": "myservice",
"description": "API specification for Azure Machine Learning myservice",
"version": "1.0"
},
"schemes": [
"https"
],
"consumes": [
"application/json"
],
"produces": [
"application/json"
],
"securityDefinitions": {
"Bearer": {
"type": "apiKey",
"name": "Authorization",
"in": "header",
"description": "For example: Bearer abc123"
}
},
"paths": {
"/": {
"get": {
"operationId": "ServiceHealthCheck",
"description": "Simple health check endpoint to ensure the service is up at any given point.",
"responses": {
"200": {
"description": "If service is up and running, this response will be returned with the content 'Healthy'",
"schema": {
"type": "string"
},
"examples": {
"application/json": "Healthy"
}
},
"default": {
"description": "The service failed to execute due to an error.",
"schema": {
"$ref": "#/definitions/ErrorResponse"
}
}
}
}
},
"/score": {
"post": {
"operationId": "RunMLService",
"description": "Run web service's model and get the prediction output",
"security": [
{
"Bearer": []
}
],
"parameters": [
{
"name": "serviceInputPayload",
"in": "body",
"description": "The input payload for executing the real-time machine learning service.",
"schema": {
"$ref": "#/definitions/ServiceInput"
}
}
],
"responses": {
"200": {
"description": "The service processed the input correctly and provided a result prediction, if applicable.",
"schema": {
"$ref": "#/definitions/ServiceOutput"
}
},
"default": {
"description": "The service failed to execute due to an error.",
"schema": {
"$ref": "#/definitions/ErrorResponse"
}
}
}
}
}
},
"definitions": {
"ServiceInput": {
"type": "object",
"properties": {
"data": {
"type": "array",
"items": {
"type": "array",
"items": {
"type": "integer",
"format": "int64"
}
}
}
},
"example": {
"data": [
[ 10, 9, 8, 7, 6, 5, 4, 3, 2, 1 ]
]
}
},
"ServiceOutput": {
"type": "array",
"items": {
"type": "number",
"format": "double"
},
"example": [
3726.995
]
},
"ErrorResponse": {
"type": "object",
"properties": {
"status_code": {
"type": "integer",
"format": "int32"
},
"message": {
"type": "string"
}
}
}
}
}
For more information, see OpenAPI specification.
For a utility that can create client libraries from the specification, see swagger-codegen.
Tip
You can retrieve the schema JSON document after you deploy the service. Use the swagger_uri property from the deployed web service (for example, service.swagger_uri
) to get the URI to the local web service's Swagger file.
To view a reference architecture for real-time scoring of Python and deep learning models, go to the Azure architecture center.