This page shows how to deploy instrumented GenAI applications on Databricks so that production traces are captured automatically.
Traces are logged to an MLflow experiment for real-time viewing. Optionally, store them in Delta tables for long-term retention. See Deploy a traced app to compare deployment and trace logging options.
For apps deployed outside Databricks, see Trace agents deployed outside of Databricks.

Deploy with Agent Framework (recommended)
Steps for deployment
First, set up the storage location(s) for traces:
- If you plan to use Production Monitoring to store traces in Delta tables, ensure it is enabled for your workspace.
- Create an MLflow experiment for storing your app's production traces.
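The snippet below is a minimal sketch of this setup from a notebook. The experiment path is a placeholder, and mlflow.set_experiment creates the experiment if it does not already exist.

```python
import mlflow

# Create (or reuse) the experiment that will receive production traces.
# The workspace path below is a placeholder; choose any path your app can write to.
experiment = mlflow.set_experiment("/Shared/my-agent-prod-traces")

# Keep the experiment ID handy; it is needed later when configuring trace logging
# (for example, the MLFLOW_EXPERIMENT_ID environment variable for CPU serving).
print(experiment.experiment_id)
```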
Next, in your Python notebook, instrument your agent with MLflow Tracing, and use Agent Framework to deploy your agent:
1. Install `mlflow[databricks]` in your Python environment. Use the latest version.
2. Connect to the MLflow experiment using `mlflow.set_experiment(...)`.
3. Wrap your agent's code using Agent Framework's authoring interfaces. In your agent code, enable MLflow Tracing using automatic or manual instrumentation.
4. Log your agent as an MLflow model, and register it to Unity Catalog. Ensure that `mlflow` is in the model's Python dependencies, with the same package version used in your notebook environment.
5. Use `agents.deploy(...)` to deploy the Unity Catalog model (agent) to a Model Serving endpoint (see the code sketch below).

Note

If you are deploying an agent from a notebook stored in a Databricks Git folder, MLflow 3 real-time tracing does not work by default. To enable real-time tracing, set the experiment to a non-Git-associated experiment using `mlflow.set_experiment()` before running `agents.deploy()`.
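The following sketch condenses steps 1 through 5. It assumes MLflow 3, a code-based agent definition in agent.py that uses an Agent Framework authoring interface with tracing enabled, and placeholder experiment and Unity Catalog names.

```python
import mlflow
from databricks import agents

# Log traces and models to the experiment created earlier (placeholder path).
mlflow.set_experiment("/Shared/my-agent-prod-traces")
# Register models to Unity Catalog.
mlflow.set_registry_uri("databricks-uc")

UC_MODEL_NAME = "main.default.my_agent"  # placeholder Unity Catalog model name

# Log the agent as a code-based MLflow model. agent.py contains the agent wrapped
# in an Agent Framework authoring interface, with MLflow Tracing enabled via
# automatic or manual instrumentation.
with mlflow.start_run():
    logged_agent = mlflow.pyfunc.log_model(
        name="agent",
        python_model="agent.py",
        # Pin mlflow to the notebook's version so serving matches the dev environment.
        pip_requirements=[f"mlflow[databricks]=={mlflow.__version__}"],
    )

# Register the logged model to Unity Catalog, then deploy it to Model Serving.
uc_model = mlflow.register_model(logged_agent.model_uri, UC_MODEL_NAME)
agents.deploy(UC_MODEL_NAME, uc_model.version)
```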
Traces from your agent now appear in the MLflow experiment in real-time.
Example notebook
This notebook demonstrates the deployment steps above.
Agent Framework and MLflow Tracing notebook
Deploy with custom CPU serving (alternative)
If you can't use Agent Framework, deploy your agent using custom CPU Model Serving instead.
First, set up the storage location(s) for traces:
- If you plan to use Production Monitoring to store traces in Delta tables, ensure it is enabled for your workspace.
- Create an MLflow experiment for storing your app's production traces.
Next, in your Python notebook, instrument your agent with MLflow Tracing, and use the Model Serving UI or APIs to deploy your agent:
1. Log your agent as an MLflow model with automatic or manual tracing instrumentation.
2. Deploy the model to CPU serving.
3. Provision a Service Principal or Personal Access Token (PAT) with `CAN_EDIT` access to the MLflow experiment.
4. In the CPU serving endpoint page, go to "Edit endpoint." For each deployed model to trace, add the following environment variables (a sketch of setting them with the Databricks SDK follows these steps):
   - `ENABLE_MLFLOW_TRACING=true`
   - `MLFLOW_EXPERIMENT_ID=<ID of the experiment you created>`
   - If you provisioned a Service Principal, set `DATABRICKS_CLIENT_ID` and `DATABRICKS_CLIENT_SECRET`. If you provisioned a PAT, set `DATABRICKS_HOST` and `DATABRICKS_TOKEN`.
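If you prefer to script this configuration rather than use the endpoint UI, the sketch below uses the Databricks Python SDK. Endpoint, model, and secret names are placeholders, and the workload settings should match your endpoint's existing configuration.

```python
from databricks.sdk import WorkspaceClient
from databricks.sdk.service.serving import ServedEntityInput

w = WorkspaceClient()

# Update the served entity with the tracing environment variables.
# All names below are placeholders.
w.serving_endpoints.update_config(
    name="my-agent-endpoint",
    served_entities=[
        ServedEntityInput(
            entity_name="main.default.my_agent",
            entity_version="1",
            workload_size="Small",
            scale_to_zero_enabled=True,
            environment_vars={
                "ENABLE_MLFLOW_TRACING": "true",
                "MLFLOW_EXPERIMENT_ID": "<experiment-id>",
                # PAT-based auth shown here; with a service principal, set
                # DATABRICKS_CLIENT_ID and DATABRICKS_CLIENT_SECRET instead.
                "DATABRICKS_HOST": "{{secrets/my_scope/databricks_host}}",
                "DATABRICKS_TOKEN": "{{secrets/my_scope/databricks_pat}}",
            },
        )
    ],
)
```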
View production traces
View production traces in the MLflow Experiments UI. Production traces show:
- User queries and agent responses
- Feedback (thumbs up/down, comments)
- Error rates and failure patterns
- Latency and performance metrics
- Token consumption
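Beyond the UI, traces in the experiment can also be queried programmatically. The sketch below uses `mlflow.search_traces`; the experiment ID is a placeholder.

```python
import mlflow

# Fetch recent production traces as a pandas DataFrame. The experiment ID is a
# placeholder; pass a filter_string to narrow results (for example, to errors only).
traces = mlflow.search_traces(
    experiment_ids=["<experiment-id>"],
    max_results=100,
)
print(traces.head())
```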

Log traces to Delta tables
Optionally, log traces to Delta tables in addition to your MLflow experiment:
- Production monitoring tables (recommended): The job to sync traces to a Delta table runs every ~15 mins. You do not need to enable any monitoring metrics for this to work. Traces do not have size limits.
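Once synced, these traces can be queried like any other Delta table. The query below is illustrative only; the three-level table name is a placeholder determined by your production monitoring configuration.

```python
# Query the synced traces with Spark SQL from a Databricks notebook.
# Replace the placeholder table name with the Delta table that production
# monitoring writes traces to.
recent_traces = spark.sql(
    """
    SELECT *
    FROM main.default.my_agent_trace_logs
    LIMIT 100
    """
)
display(recent_traces)
```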
Limitations
Logging traces to MLflow experiments and to production monitoring tables comes with limits on the number of traces and peak load.
Next steps
- Add context to traces - Attach metadata for request tracking, user sessions, and environment data.
- Track token usage - Monitor token consumption for cost tracking.
- Production monitoring - Automatically evaluate traces with scorers.