Release notes for Databricks feature engineering and legacy workspace feature store

This page lists releases of the Databricks Feature Engineering in Unity Catalog client and the Databricks Workspace Feature Store client. Both clients are available on PyPI as databricks-feature-engineering.

The libraries are used to:

  • Create, read, and write feature tables.
  • Train models on feature data.
  • Publish feature tables to online stores for real-time serving.

For usage documentation, see Databricks Feature Store. For Python API documentation, see Python API.

The Feature Engineering in Unity Catalog client works for features and feature tables in Unity Catalog. The Workspace Feature Store client works for features and feature tables in Workspace Feature Store. Both clients are pre-installed in Databricks Runtime for Machine Learning. They can also run on Databricks Runtime after installing databricks-feature-engineering from PyPI (pip install databricks-feature-engineering). For unit testing only, both clients can be used locally or in CI/CD environments.

For a table showing client version compatibility with Databricks Runtime and Databricks Runtime ML versions, see Feature Engineering compatibility matrix. Older versions of Databricks Workspace Feature Store client are available on PyPI as databricks-feature-store.
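Before consulting the compatibility matrix, it can help to confirm which client package and version an environment actually has. The following is a minimal sketch that uses only the Python standard library. The distribution names come from this page; the helper name installed_client_versions is hypothetical and is not part of either client.

```python
from importlib import metadata

# Distribution names as published on PyPI (taken from this page).
CLIENT_PACKAGES = ("databricks-feature-engineering", "databricks-feature-store")


def installed_client_versions(packages=CLIENT_PACKAGES):
    """Return {package name: version string, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_client_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

A package that is not installed is reported as None rather than raising an error, so the check is safe to run in local or CI/CD environments where neither client is present.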

databricks-feature-engineering 0.7.0

databricks-feature-engineering 0.6.0

  • Running point-in-time joins with native Spark is now supported, in addition to existing support with Tempo. Huge thanks to Semyon Sinchenko for suggesting the idea!
  • StructType is now supported as a PySpark data type. StructType is not supported for online serving.
  • write_table now supports writing to tables that have liquid clustering enabled.
  • The timeseries_columns parameter for create_table has been renamed to timeseries_column. Existing workflows can continue to use the timeseries_columns parameter.
  • score_batch now supports the env_manager parameter. See the MLflow documentation for more information.
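As background for the point-in-time join item above: such a join matches each lookup row with the latest feature value recorded at or before the lookup timestamp. The plain-Python sketch below illustrates only that matching rule. It is not how the client, Spark, or Tempo implement the join, and the function name and sample data are hypothetical.

```python
from bisect import bisect_right


def point_in_time_lookup(feature_rows, lookup_ts):
    """Return the value of the latest feature row with ts <= lookup_ts.

    feature_rows: list of (ts, value) tuples for one key, sorted by ts.
    Returns None when no feature row exists at or before lookup_ts.
    """
    timestamps = [ts for ts, _ in feature_rows]
    idx = bisect_right(timestamps, lookup_ts)
    return feature_rows[idx - 1][1] if idx > 0 else None


# Hypothetical data: one feature for one user, keyed by an integer timestamp.
features = [(10, 3), (20, 5), (30, 8)]

print(point_in_time_lookup(features, 25))  # 5
print(point_in_time_lookup(features, 30))  # 8
print(point_in_time_lookup(features, 5))   # None
```

Feature values recorded after the lookup timestamp are never returned, which is what prevents future data from leaking into a training set.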

databricks-feature-engineering 0.5.0

  • The new update_feature_spec API in databricks-feature-engineering lets users update the owner of a FeatureSpec in Unity Catalog.

databricks-feature-engineering 0.4.0

  • Small bug fixes and improvements.

databricks-feature-engineering 0.3.0

  • log_model now uses the new databricks-feature-lookup PyPI package, which includes performance improvements for online model serving.

databricks-feature-store 0.17.0

  • databricks-feature-store is deprecated. All existing modules in this package are available in databricks-feature-engineering version 0.2.0 and above. For details, see Python API.

databricks-feature-engineering 0.2.0

  • databricks-feature-engineering now contains all modules from databricks-feature-store. For details, see Python API.

databricks-feature-store 0.16.3

  • Fixed a timeout bug when using AutoML with feature tables.

databricks-feature-engineering 0.1.3

  • Small improvements in the UpgradeClient.

databricks-feature-store 0.16.2

databricks-feature-store 0.16.1

  • Small bug fixes and improvements.

databricks-feature-engineering 0.1.2 & databricks-feature-store 0.16.0

  • Small bug fixes and improvements.
    • Fixed incorrect job lineage URLs logged with certain workspace setups.

databricks-feature-engineering 0.1.1

  • Small bug fixes and improvements.

databricks-feature-engineering 0.1.0

  • GA release of the Feature Engineering in Unity Catalog Python client to PyPI.

databricks-feature-store 0.15.1

  • Small bug fixes and improvements.

databricks-feature-store 0.15.0

  • You can now automatically infer and log an input example when you log a model. To do this, set infer_model_example to True when you call log_model. The example is based on the training data specified in the training_set parameter.

databricks-feature-store 0.14.2

  • Fixed a bug in publishing to Aurora MySQL when using MariaDB Connector/J 2.7.5 or above.

databricks-feature-store 0.14.1

  • Small bug fixes and improvements.

databricks-feature-store 0.14.0

Starting with 0.14.0, you must specify timestamp key columns in the primary_keys argument. Timestamp keys are part of the "primary keys" that uniquely identify each row in the feature table. Like other primary key columns, timestamp key columns cannot contain NULL values.

In the following example, the DataFrame user_features_df contains the columns user_id, ts, purchases_30d, and is_free_trial_active.

0.14.0 and above

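from databricks.feature_store import FeatureStoreClient

fs = FeatureStoreClient()

# The timestamp key column "ts" must also be listed in primary_keys.
fs.create_table(
    name="ads_team.user_features",
    primary_keys=["user_id", "ts"],
    timestamp_keys="ts",
    features_df=user_features_df,
)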

0.13.1 and below

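from databricks.feature_store import FeatureStoreClient

fs = FeatureStoreClient()

# The timestamp key column "ts" is listed only in timestamp_keys.
fs.create_table(
    name="ads_team.user_features",
    primary_keys="user_id",
    timestamp_keys="ts",
    features_df=user_features_df,
)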
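When migrating code written for 0.13.1 and below, the change amounts to adding each timestamp key column to the primary keys. The following is a minimal sketch of that step. The helper merge_timestamp_keys is hypothetical and is not part of the client; it only builds the list to pass as primary_keys.

```python
def merge_timestamp_keys(primary_keys, timestamp_keys):
    """Return primary_keys with any missing timestamp keys appended.

    Both arguments may be a single column name or a list of names.
    """
    def as_list(value):
        return [value] if isinstance(value, str) else list(value)

    merged = as_list(primary_keys)
    for key in as_list(timestamp_keys):
        if key not in merged:
            merged.append(key)
    return merged


print(merge_timestamp_keys("user_id", "ts"))          # ['user_id', 'ts']
print(merge_timestamp_keys(["user_id", "ts"], "ts"))  # ['user_id', 'ts']
```

The helper is idempotent, so arguments that already follow the 0.14.0 convention pass through unchanged.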

databricks-feature-store 0.13.1

  • Small bug fixes and improvements.

databricks-feature-store 0.13.0

  • The minimum required mlflow-skinny version is now 2.4.0.
  • Creating a training set fails if the provided DataFrame does not contain all required lookup keys.
  • When logging a model that uses feature tables in Unity Catalog, an MLflow signature is automatically logged with the model.
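Because training set creation now fails when lookup keys are missing, a quick check beforehand can give a clearer error message. The sketch below compares column names only. The function name and the sample column names are hypothetical, and the client performs its own validation regardless.

```python
def missing_lookup_keys(df_columns, lookup_keys):
    """Return the lookup keys that are absent from df_columns, in order."""
    present = set(df_columns)
    return [key for key in lookup_keys if key not in present]


# Hypothetical column and key names.
columns = ["user_id", "label"]
print(missing_lookup_keys(columns, ["user_id", "ts"]))  # ['ts']
```

With a PySpark DataFrame, the column names are available as df.columns.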

databricks-feature-store 0.12.0

  • You can now delete an online store by using the drop_online_table API.

databricks-feature-store 0.11.0

  • In Unity Catalog-enabled workspaces, you can now publish both workspace and Unity Catalog feature tables to Cosmos DB online stores. This requires Databricks Runtime 13.0 ML or above.

databricks-feature-store 0.10.0

  • Small bug fixes and improvements.

databricks-feature-store 0.9.0

  • Small bug fixes and improvements.

databricks-feature-store 0.8.0

  • Small bug fixes and improvements.

databricks-feature-store 0.7.1

  • Added flask as a dependency to fix a missing-dependency issue when scoring models with score_batch.

databricks-feature-store 0.7.0

  • Small bug fixes and improvements.

databricks-feature-store 0.6.1

  • Initial public release of the Databricks Feature Store client to PyPI.