Feature governance and lineage
This page describes the governance and lineage capabilities of feature engineering in Unity Catalog.
Control access to feature tables
Unity Catalog manages access control for feature tables. For details, see Unity Catalog privileges.
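As a rough illustration of what granting read access could look like, the sketch below builds the `GRANT` statements a principal would typically need to read one feature table: `USE CATALOG`, `USE SCHEMA`, and `SELECT`. The helper `grant_statements` and the group name `data-scientists` are hypothetical, and the table name is borrowed from the example later on this page. The snippet only builds and prints the SQL strings; in a Databricks notebook, each statement could be passed to `spark.sql` instead.

```python
def grant_statements(table: str, principal: str) -> list[str]:
    """Build Unity Catalog GRANT statements for read access to one table.

    `table` is a three-level name (catalog.schema.table).
    `principal` is a user or group name (hypothetical in this sketch).
    """
    catalog, schema, _ = table.split(".")
    return [
        f"GRANT USE CATALOG ON CATALOG {catalog} TO `{principal}`",
        f"GRANT USE SCHEMA ON SCHEMA {catalog}.{schema} TO `{principal}`",
        f"GRANT SELECT ON TABLE {table} TO `{principal}`",
    ]


statements = grant_statements(
    "main.on_demand_demo.restaurant_features",  # feature table from the example on this page
    "data-scientists",                          # hypothetical group name
)
for stmt in statements:
    # Printed here; in a notebook this could be spark.sql(stmt).
    print(stmt)
```

Which privileges a principal actually needs depends on the workspace setup, so treat the list above as a starting point and not as a complete policy.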
View feature table, function, and model lineage
When you log a model using FeatureEngineeringClient.log_model, the features used in the model are automatically tracked and can be viewed in the Lineage tab of Catalog Explorer. In addition to feature tables, Python UDFs that are used to compute on-demand features are also tracked.
How to capture lineage of a feature table, function, or model
Lineage information for the feature tables and functions used in a model is captured automatically when you call log_model. See the following example code.
import mlflow
from databricks.feature_engineering import FeatureEngineeringClient, FeatureLookup, FeatureFunction

fe = FeatureEngineeringClient()

features = [
    # Look up restaurant coordinates from a feature table.
    FeatureLookup(
        table_name="main.on_demand_demo.restaurant_features",
        feature_names=["latitude", "longitude"],
        rename_outputs={"latitude": "restaurant_latitude", "longitude": "restaurant_longitude"},
        lookup_key="restaurant_id",
        timestamp_lookup_key="ts",
    ),
    # Compute on-demand features with Python UDFs registered in Unity Catalog.
    FeatureFunction(
        udf_name="main.on_demand_demo.extract_user_latitude",
        output_name="user_latitude",
        input_bindings={"blob": "json_blob"},
    ),
    FeatureFunction(
        udf_name="main.on_demand_demo.extract_user_longitude",
        output_name="user_longitude",
        input_bindings={"blob": "json_blob"},
    ),
    FeatureFunction(
        udf_name="main.on_demand_demo.haversine_distance",
        output_name="distance",
        input_bindings={"x1": "restaurant_longitude", "y1": "restaurant_latitude", "x2": "user_longitude", "y2": "user_latitude"},
    ),
]

# label_df is assumed to be an existing DataFrame that contains the
# restaurant_id, json_blob, ts, and label columns.
training_set = fe.create_training_set(
    df=label_df,
    feature_lookups=features,
    label="label",
    exclude_columns=["restaurant_id", "json_blob", "restaurant_latitude", "restaurant_longitude", "user_latitude", "user_longitude", "ts"],
)

class IsClose(mlflow.pyfunc.PythonModel):
    def predict(self, ctx, inp):
        # Classify a restaurant as close when the computed distance is below 2.5.
        return (inp["distance"] < 2.5).values

model_name = "fe_packaged_model"
# Models in Unity Catalog use three-level names (catalog.schema.model).
# The catalog and schema here are assumed to match the feature table's.
registered_model_name = f"main.on_demand_demo.{model_name}"

mlflow.set_registry_uri("databricks-uc")

# Logging through the feature engineering client is what captures lineage.
fe.log_model(
    model=IsClose(),
    artifact_path=model_name,
    flavor=mlflow.pyfunc,
    training_set=training_set,
    registered_model_name=registered_model_name,
)
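The example above references the UDF main.on_demand_demo.haversine_distance without showing its body. The following is a minimal plain-Python sketch of what such a function might compute, not the actual UDF definition. It assumes that x is longitude and y is latitude (both in degrees), that the result is in kilometers, and that the Earth is a sphere with a radius of 6371 km. The helper `is_close` is hypothetical and only mirrors the 2.5 threshold used by the IsClose model.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius; the real UDF's unit is not stated


def haversine_distance(x1, y1, x2, y2):
    """Great-circle distance between (lon x1, lat y1) and (lon x2, lat y2).

    Inputs are in degrees; the result is in kilometers (an assumption).
    """
    lon1, lat1, lon2, lat2 = map(math.radians, (x1, y1, x2, y2))
    a = (
        math.sin((lat2 - lat1) / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def is_close(distance, threshold=2.5):
    """Hypothetical helper mirroring the comparison in IsClose.predict."""
    return distance < threshold


# Two points roughly 1 km apart in latitude (illustrative coordinates).
d = haversine_distance(-122.40, 37.78, -122.40, 37.789)
print(d, is_close(d))
```

Because the model consumes only the `distance` column, the lookup and intermediate columns can be excluded from the training set while the table and all three functions still appear in the model's lineage.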
View the lineage of a feature table, model, or function
To view the lineage of a feature table, model, or function, follow these steps:
Navigate to the table, model version, or function page in Catalog Explorer.
Select the Lineage tab. The left sidebar shows Unity Catalog components that were logged with this table, model version, or function.
Click See lineage graph. The lineage graph appears. For details about exploring the lineage graph, see Capture and explore lineage.
To close the lineage graph, click the close icon in the upper-right corner.