CLI (v2) feature store YAML schema
APPLIES TO: Azure CLI ml extension v2 (current)
Note
The YAML syntax detailed in this document is based on the JSON schema for the latest version of the ML CLI v2 extension. This syntax is guaranteed only to work with the latest version of the ML CLI v2 extension. You can find the schemas for older extension versions at https://azuremlschemasprod.azureedge.net/.
YAML syntax
Key | Type | Description | Allowed values | Default value |
---|---|---|---|---|
$schema | string | The YAML schema. If you use the Azure Machine Learning VS Code extension to author the YAML file, including $schema at the top of your file enables you to invoke schema and resource completions. | ||
name | string | Required. Name of the feature store. | ||
compute_runtime | object | The compute runtime configuration used for materialization job. | ||
compute_runtime.spark_runtime_version | string | The Azure Machine Learning Spark runtime version. | 3.4 | 3.4 |
offline_store | object | |||
offline_store.type | string | Required if offline_store is provided. The type of offline store. Only data lake gen2 type of storage is supported. | azure_data_lake_gen2 | |
offline_store.target | string | Required if offline_store is provided. The datalake Gen2 storage URI in the format of /subscriptions/<subscription_id>/resourceGroups/<resource_group>/providers/Microsoft.Storage/storageAccounts/<account>/blobServices/default/containers/<container> . |
||
online_store | object | |||
online_store.type | string | Required if online_store is provided. The type of online store. Only redis cache is supported. | redis | |
online_store.target | string | Required if online_store is provided. The Redis Cache URI in the format of /subscriptions/<subscription_id>/resourceGroups/<resource_group>/providers/Microsoft.Cache/Redis/<redis-name> . |
||
materialization_identity | object | The user-assigned managed identity that used for the materialization job. This identity needs to be granted necessary roles to access Feature Store service, the data source, and the offline storage. | ||
materialization_identity.client_id | string | The client ID for your user-assigned managed identity. | ||
materialization_identity.resource_id | string | The resource ID for your user-assigned managed identity. | ||
materialization_identity.principal_id | string | the principal ID for your user-assigned managed identity. | ||
description | string | Description of the feature store. | ||
tags | object | Dictionary of tags for the feature store. | ||
display_name | string | Display name of the feature store in the studio UI. Can be nonunique within the resource group. | ||
location | string | The location of the feature store. | The resource group location. | |
resource_group | string | The resource group containing the feature store. If the resource group doesn't exist, a new one is created. |
You can include other workspace properties.
Remarks
The az ml feature-store
command can be used for managing Azure Machine Learning feature store workspaces.
Examples
Examples are available in the examples GitHub repository. Some common examples are shown here:
YAML basic
$schema: http://azureml/sdk-2-0/FeatureStore.json
name: mktg-feature-store
location: chinanorth3
YAML with offline store configuration
$schema: http://azureml/sdk-2-0/FeatureStore.json
name: mktg-feature-store
compute_runtime:
spark_runtime_version: 3.2
offline_store:
type: azure_data_lake_gen2
target: /subscriptions/<sub-id>/resourceGroups/<rg>/providers/Microsoft.Storage/storageAccounts/<account_name>/blobServices/default/containers/<container_name>
materialization_identity:
client_id: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
resource_id: /subscriptions/<sub-id>/resourceGroups/<rg>/providers/Microsoft.ManagedIdentity/userAssignedIdentities/<uai-name>
# Many of workspace parameters will also be supported:
location: chinanorth3
display_name: marketing feature store
tags:
foo: bar
Configure the online store in the CLI with YAML
$schema: http://azureml/sdk-2-0/FeatureStore.json
name: mktg-feature-store
compute_runtime:
spark_runtime_version: 3.4
online_store:
type: redis
target: "/subscriptions/<sub-id>/resourceGroups/<rg>/providers/Microsoft.Cache/Redis/<redis-name>"
materialization_identity:
client_id: 00001111-aaaa-2222-bbbb-3333cccc4444
principal_id: aaaaaaaa-bbbb-cccc-1111-222222222222
resource_id: /subscriptions/<sub-id>/resourceGroups/<rg>/providers/Microsoft.ManagedIdentity/userAssignedIdentities/<uai-name>
# Many of workspace parameters will also be supported:
location: eastus
display_name: marketing feature store
tags:
foo: bar
Configure the online store in the CLI with Python
redis_arm_id = f"/subscriptions/{subscription_id}/resourceGroups/{resource_group_name}/providers/Microsoft.Cache/Redis/{redis_name}"
online_store = MaterializationStore(type="redis", target=redis_arm_id)
fs = FeatureStore(
name=featurestore_name,
location=location,
online_store=online_store,
)
# wait for feature store creation
fs_poller = ml_client.feature_stores.begin_create(fs)
# move the feature store to a YAML file
yaml_path = root_dir + "/featurestore/featurestore_with_online.yaml"
fs.dump(yaml_path)