AI and machine learning on Databricks

Build, deploy, and manage AI and machine learning applications with Mosaic AI, an integrated platform that unifies the entire AI lifecycle from data preparation to production monitoring.

For a set of tutorials to get you started, see AI and machine learning tutorials.

Build generative AI applications

Develop and deploy enterprise-grade generative AI applications.

Feature	Description
MLflow for GenAI	Measure, improve, and monitor quality throughout the GenAI application lifecycle using AI-powered metrics and comprehensive trace observability.

Train classic machine learning models

Create machine learning models with automated tools and collaborative development environments.

Feature	Description
AutoML	Automatically build high-quality models with minimal code using automated feature engineering and hyperparameter tuning.
Databricks Runtime for ML	Pre-configured clusters with TensorFlow, PyTorch, Keras, and GPU support for deep learning development.
MLflow tracking	Track experiments, compare model performance, and manage the complete model development lifecycle.
Feature engineering	Create, manage, and serve features with automated data pipelines and feature discovery.
Databricks notebooks	Collaborative development environment with support for Python, R, Scala, and SQL for ML workflows.
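
To make the hyperparameter tuning mentioned in the AutoML row concrete, the following is a minimal, standard-library-only sketch of a grid search scored on a held-out validation split. It does not use or represent the AutoML API. The toy dataset, the gradient-descent model, and the names `make_data`, `fit`, `mse`, and `grid_search` are all assumptions made for illustration.

```python
import random
from itertools import product
from statistics import mean


def make_data(n=60, seed=0):
    """Generate a toy 1-D regression set: y = 3x + 1 plus Gaussian noise."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    ys = [3.0 * x + 1.0 + rng.gauss(0.0, 0.1) for x in xs]
    return xs, ys


def fit(xs, ys, lr, epochs):
    """Fit y = w*x + b by full-batch gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def mse(params, xs, ys):
    """Mean squared error of the line (w, b) on the given points."""
    w, b = params
    return mean((w * x + b - y) ** 2 for x, y in zip(xs, ys))


def grid_search(xs, ys, grid, split=0.75):
    """Try every hyperparameter combination and score it on held-out data."""
    cut = int(len(xs) * split)
    train_x, train_y = xs[:cut], ys[:cut]
    val_x, val_y = xs[cut:], ys[cut:]
    results = []
    for lr, epochs in product(grid["lr"], grid["epochs"]):
        params = fit(train_x, train_y, lr, epochs)
        results.append(
            {"lr": lr, "epochs": epochs, "val_mse": mse(params, val_x, val_y)}
        )
    # Lowest validation error first.
    return sorted(results, key=lambda r: r["val_mse"])


xs, ys = make_data()
ranked = grid_search(xs, ys, {"lr": [0.001, 0.01, 0.1], "epochs": [10, 100]})
best = ranked[0]
print(best)
```

Scoring each combination on data the model did not train on is what keeps the search from simply rewarding overfitting. Automated tools typically add smarter search strategies and cross-validation on top of this basic loop.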

Train deep learning models

Use built-in frameworks to develop deep learning models.

Feature	Description
Distributed training	Examples of distributed deep learning using Ray, TorchDistributor, and DeepSpeed.
Best practices for deep learning on Databricks	Recommended practices for developing deep learning models on Databricks.
PyTorch	Single-node and distributed training using PyTorch.
TensorFlow	Single-node and distributed training using TensorFlow and TensorBoard.
Reference solutions	Reference solutions for deep learning.

Deploy and serve models

Deploy models to production with scalable endpoints, real-time inference, and enterprise-grade monitoring.

Monitor and govern ML systems

Ensure model quality, data integrity, and compliance with comprehensive monitoring and governance tools.

Feature	Description
Unity Catalog	Govern data, features, models, and functions with unified access control, lineage tracking, and discovery.
MLflow for Models	Track, evaluate, and monitor models throughout the development lifecycle.

Productionize ML workflows

Scale machine learning operations with automated workflows, CI/CD integration, and production-ready pipelines.

Feature	Description
Models in Unity Catalog	Use the model registry in Unity Catalog for centralized governance and to manage the model lifecycle, including deployments.
Lakeflow Jobs	Build automated workflows and production-ready ETL pipelines for ML data processing.
Ray on Databricks	Scale ML workloads with distributed computing for large-scale model training and inference.
MLOps workflows	Implement end-to-end MLOps with automated training, testing, and deployment pipelines.
Git integration	Put ML code and notebooks under version control and collaborate on them through Git.

Last updated on 2026-01-26