Specify a managed storage location in Unity Catalog
A managed storage location specifies a location in cloud object storage for storing data for managed tables and managed volumes.
You can associate a managed storage location with a metastore, catalog, or schema. Managed storage locations at lower levels in the hierarchy override storage locations defined at higher levels when managed tables or managed volumes are created.
Metastore-level managed storage is optional, and new workspaces that are automatically enabled for Unity Catalog are created without a metastore-level managed storage location. Azure Databricks recommends that you assign managed storage at the catalog level for logical data isolation, with metastore-level and schema-level storage as options. However, metastore-level storage is required for some functionality, like sharing notebooks using Delta Sharing or using personal staging locations as an Azure Databricks partner. See Automatic enablement of Unity Catalog, Data governance and data isolation building blocks, and the metastore creation article (/data-governance/unity-catalog/create-metastore.md).
What is a managed storage location?
Managed storage locations have the following properties:
- Managed tables and managed volumes store data and metadata files in managed storage locations.
- Managed storage locations cannot overlap with external tables or external volumes.
The following table describes how a managed storage location is declared and associated with Unity Catalog objects:
Associated Unity Catalog object | How to set | Relation to external locations |
---|---|---|
Metastore | Configured by account admin during metastore creation. | Cannot overlap an external location. |
Catalog | Specified during catalog creation using the MANAGED LOCATION keyword. | Must be contained within an external location. |
Schema | Specified during schema creation using the MANAGED LOCATION keyword. | Must be contained within an external location. |
Unity Catalog uses the following rules to determine which managed storage location stores the data and metadata for a managed table or managed volume:
- If the containing schema has a managed location, the data is stored in the schema managed location.
- If the containing schema does not have a managed location but the catalog has a managed location, the data is stored in the catalog managed location.
- If neither the containing schema nor the containing catalog has a managed location, the data is stored in the metastore managed location.
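As an illustration of these rules, the following sketch creates a catalog with a managed location and two schemas, only one of which declares its own. The names sales, raw, curated, and events are hypothetical, and the path placeholders are assumptions that you would replace with paths inside an external location you can use.

```sql
-- Hypothetical names; replace the placeholders with your own values.
-- Both managed locations must be contained within an external location.
CREATE CATALOG sales
MANAGED LOCATION 'abfss://<container-name>@<storage-account>.dfs.core.chinacloudapi.cn/<catalog-path>';

-- This schema declares its own managed location, which takes precedence
-- over the catalog managed location for objects created in the schema.
CREATE SCHEMA sales.raw
MANAGED LOCATION 'abfss://<container-name>@<storage-account>.dfs.core.chinacloudapi.cn/<schema-path>';

-- This schema has no managed location, so its managed tables and volumes
-- fall back to the catalog managed location.
CREATE SCHEMA sales.curated;

-- Stored under the schema managed location.
CREATE TABLE sales.raw.events (id BIGINT);

-- Stored under the catalog managed location.
CREATE TABLE sales.curated.events (id BIGINT);

-- DESCRIBE DETAIL returns a location column that shows where each table was stored.
DESCRIBE DETAIL sales.raw.events;
DESCRIBE DETAIL sales.curated.events;
```

If neither the schema nor the catalog had a managed location, the same CREATE TABLE statements would store data in the metastore managed location, provided the metastore has one.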
Unity Catalog prevents overlap of location governance. See How do paths work for data managed by Unity Catalog?.
Managed storage location, storage root, and storage location
When you specify a MANAGED LOCATION for a catalog or schema, the provided location is tracked as the Storage Root in Unity Catalog. To ensure that all managed entities have a unique location, Unity Catalog adds hashed subdirectories to the specified location, using the following format:
Object | Path |
---|---|
Schema | <storage-root>/__unitystorage/schemas/00000000-0000-0000-0000-000000000000 |
Catalog | <storage-root>/__unitystorage/catalogs/00000000-0000-0000-0000-000000000000 |
The fully qualified path for the managed storage location is tracked as the Storage Location in Unity Catalog.
You can specify the same managed storage location for multiple schemas and catalogs.
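One way to check these values, offered as a sketch rather than a definitive procedure, is to describe the catalog or schema. The exact rows returned can vary between Databricks Runtime versions, so treat the presence of storage root and storage location entries in the output as an assumption to verify in your own workspace.

```sql
-- Placeholders are the same as in the examples elsewhere in this article.
-- Assumption: the extended output includes the storage root and storage location.
DESCRIBE CATALOG EXTENDED <catalog-name>;

DESCRIBE SCHEMA EXTENDED <catalog-name>.<schema-name>;
```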
Required privileges
Users who have the CREATE MANAGED STORAGE privilege on an external location can configure managed storage locations during catalog or schema creation.
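As a minimal sketch, a user who is allowed to manage grants on the external location could give this privilege to another principal as follows. The placeholders <external-location-name> and <principal> are illustrative and do not appear elsewhere in this article.

```sql
-- Grant the privilege on an existing external location.
-- <principal> is a user, group, or service principal name.
GRANT CREATE MANAGED STORAGE ON EXTERNAL LOCATION <external-location-name> TO `<principal>`;

-- Review the privileges currently granted on the external location.
SHOW GRANTS ON EXTERNAL LOCATION <external-location-name>;
```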
Account admins can add an optional managed storage location at the metastore level.
Set a managed storage location for a metastore
To set a managed storage location for a metastore, see Add managed storage to an existing metastore.
Set a managed storage location for a catalog
Set a managed storage location for a catalog by using the MANAGED LOCATION keyword during catalog creation, as in the following example:
CREATE CATALOG <catalog-name>
MANAGED LOCATION 'abfss://<container-name>@<storage-account>.dfs.core.chinacloudapi.cn/<path>/<directory>';
You can also use Catalog Explorer to set the managed storage location for a catalog. See Create catalogs.
Set a managed storage location for a schema
Set a managed storage location for a schema by using the MANAGED LOCATION keyword during schema creation, as in the following example:
CREATE SCHEMA <catalog>.<schema-name>
MANAGED LOCATION 'abfss://<container-name>@<storage-account>.dfs.core.chinacloudapi.cn/<path>/<directory>';
You can also use Catalog Explorer to set the managed storage location for a schema. See Create schemas.
Next steps
Managed storage locations are used for creating managed tables and managed volumes. See Work with managed tables and What are Unity Catalog volumes?.