Important
This feature is in Beta.
Note
Databricks recommends using registered scorers for production monitoring. See the scorer lifecycle management API reference.
Production monitoring lets you continuously evaluate the quality of your GenAI application by automatically running scorers on live traffic. The monitoring service runs every 15 minutes, evaluating a configurable sample of traces with the same scorers you used during development.
How it works
When production monitoring is enabled for an MLflow experiment:
Automatic execution - a background job runs every 15 minutes (after initial setup)
Scorer evaluation - each configured scorer runs on a sample of production traces
Feedback attachment - results are attached to each evaluated trace as feedback
Data archiving - all traces (not just the sampled ones) are written to a Delta table in Unity Catalog for analysis
The monitoring service ensures consistent evaluation with the same scorers used in development, providing automated quality assessment without manual intervention.
Important
Production monitoring supports only predefined scorers. If you need to run custom code-based or LLM-based scorers in production, contact your Databricks account representative.
API reference
create_external_monitor
Creates a monitor for a GenAI application served outside of Databricks. Once created, the monitor begins automatically evaluating traces against the configured assessment suite.
# These packages are automatically installed with mlflow[databricks]
from databricks.agents.monitoring import create_external_monitor
create_external_monitor(
*,
catalog_name: str,
schema_name: str,
assessments_config: AssessmentsSuiteConfig | dict,
experiment_id: str | None = None,
experiment_name: str | None = None,
) -> ExternalMonitor
Parameters

| Parameter | Type | Description |
|---|---|---|
| `catalog_name` | `str` | Name of the Unity Catalog catalog where the trace archive table will be created |
| `schema_name` | `str` | Name of the Unity Catalog schema where the trace archive table will be created |
| `assessments_config` | `AssessmentsSuiteConfig` or `dict` | Configuration for the suite of assessments to run on traces |
| `experiment_id` | `str` or `None` | ID of the MLflow experiment to associate with the monitor. Defaults to the currently active experiment |
| `experiment_name` | `str` or `None` | Name of the MLflow experiment to associate with the monitor. Defaults to the currently active experiment |
Returns
ExternalMonitor - the created monitor object, containing the experiment ID, configuration, and monitoring URL
Example
import mlflow
from databricks.agents.monitoring import create_external_monitor, AssessmentsSuiteConfig, BuiltinJudge, GuidelinesJudge
# Create a monitor with multiple scorers
external_monitor = create_external_monitor(
catalog_name="workspace",
schema_name="default",
assessments_config=AssessmentsSuiteConfig(
sample=0.5, # Sample 50% of traces
assessments=[
BuiltinJudge(name="safety"),
BuiltinJudge(name="relevance_to_query"),
BuiltinJudge(name="groundedness", sample_rate=0.2), # Override sampling for this scorer
GuidelinesJudge(
guidelines={
"mlflow_only": [
"If the request is unrelated to MLflow, the response must refuse to answer."
],
"professional_tone": [
"The response must maintain a professional and helpful tone."
]
}
),
],
),
)
print(f"Monitor created for experiment: {external_monitor.experiment_id}")
print(f"View traces at: {external_monitor.monitoring_page_url}")
get_external_monitor
Retrieves an existing monitor for a GenAI application served outside of Databricks.
# These packages are automatically installed with mlflow[databricks]
from databricks.agents.monitoring import get_external_monitor
get_external_monitor(
*,
experiment_id: str | None = None,
experiment_name: str | None = None,
) -> ExternalMonitor
Parameters

| Parameter | Type | Description |
|---|---|---|
| `experiment_id` | `str` or `None` | ID of the MLflow experiment associated with the monitor |
| `experiment_name` | `str` or `None` | Name of the MLflow experiment associated with the monitor |
Returns
ExternalMonitor - the retrieved monitor object
Raises
- ValueError - if both experiment_id and experiment_name are provided
- NoMonitorFoundError - if no monitor is found for the given experiment
Example
from databricks.agents.monitoring import get_external_monitor
# Get monitor by experiment ID
monitor = get_external_monitor(experiment_id="123456789")
# Get monitor by experiment name
monitor = get_external_monitor(experiment_name="my-genai-app-experiment")
# Access monitor configuration
print(f"Sampling rate: {monitor.assessments_config.sample}")
print(f"Archive table: {monitor.trace_archive_table}")
update_external_monitor
Updates the configuration of an existing monitor. The configuration is completely replaced by the new values (not merged).
# These packages are automatically installed with mlflow[databricks]
from databricks.agents.monitoring import update_external_monitor
update_external_monitor(
*,
experiment_id: str | None = None,
experiment_name: str | None = None,
assessments_config: AssessmentsSuiteConfig | dict,
) -> ExternalMonitor
Parameters

| Parameter | Type | Description |
|---|---|---|
| `experiment_id` | `str` or `None` | ID of the MLflow experiment associated with the monitor |
| `experiment_name` | `str` or `None` | Name of the MLflow experiment associated with the monitor |
| `assessments_config` | `AssessmentsSuiteConfig` or `dict` | Updated configuration that completely replaces the existing configuration |
Returns
ExternalMonitor - the updated monitor object
Raises
- ValueError - if assessments_config is not provided
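Because the configuration is replaced rather than merged, an update must restate every field you want to keep. A minimal sketch of the difference, using plain dicts to model the config (replace-vs-merge is the documented behavior; the dict shapes are illustrative):

```python
# The documented behavior: a new assessments_config fully replaces
# the old one. Modeled here with plain dicts for illustration.
existing_config = {
    "sample": 0.5,
    "assessments": [
        {"name": "safety"},
        {"name": "relevance_to_query"},
    ],
}

# An update that only sets "sample" drops the assessments entirely,
# because nothing is merged:
replaced = {"sample": 0.8}  # what the monitor would end up with

# To keep the existing assessments while changing the sample rate,
# restate the full configuration:
updated = {**existing_config, "sample": 0.8}

# In practice you would pass the full dict to the API, e.g.:
# update_external_monitor(experiment_id="...", assessments_config=updated)
print(replaced)             # {'sample': 0.8}
print(updated["assessments"])  # the original two assessments survive
```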
delete_external_monitor
Deletes the monitor for a GenAI application served outside of Databricks.
# These packages are automatically installed with mlflow[databricks]
from databricks.agents.monitoring import delete_external_monitor
delete_external_monitor(
*,
experiment_id: str | None = None,
experiment_name: str | None = None,
) -> None
Parameters

| Parameter | Type | Description |
|---|---|---|
| `experiment_id` | `str` or `None` | ID of the MLflow experiment associated with the monitor |
| `experiment_name` | `str` or `None` | Name of the MLflow experiment associated with the monitor |
Example
from databricks.agents.monitoring import delete_external_monitor
# Delete monitor by experiment ID
delete_external_monitor(experiment_id="123456789")
# Delete monitor by experiment name
delete_external_monitor(experiment_name="my-genai-app-experiment")
Configuration classes
AssessmentsSuiteConfig
Configuration for a suite of assessments to run on a GenAI application's traces.
# These packages are automatically installed with mlflow[databricks]
from databricks.agents.monitoring import AssessmentsSuiteConfig
@dataclasses.dataclass
class AssessmentsSuiteConfig:
sample: float | None = None
paused: bool | None = None
assessments: list[AssessmentConfig] | None = None
Attributes

| Attribute | Type | Description |
|---|---|---|
| `sample` | `float` or `None` | Global sampling rate, between 0.0 (exclusive) and 1.0 (inclusive). Individual assessments can override this |
| `paused` | `bool` or `None` | Whether monitoring is paused |
| `assessments` | `list[AssessmentConfig]` or `None` | List of assessments to run on traces |
Methods
from_dict
Creates an AssessmentsSuiteConfig from its dictionary representation.
@classmethod
def from_dict(cls, data: dict) -> AssessmentsSuiteConfig
get_guidelines_judge
Returns the first GuidelinesJudge from the assessments list, or None if none is found.
def get_guidelines_judge(self) -> GuidelinesJudge | None
Example
from databricks.agents.monitoring import AssessmentsSuiteConfig, BuiltinJudge, GuidelinesJudge
# Create configuration with multiple assessments
config = AssessmentsSuiteConfig(
sample=0.3, # Sample 30% of all traces
assessments=[
BuiltinJudge(name="safety"),
BuiltinJudge(name="relevance_to_query", sample_rate=0.5), # Override to 50%
GuidelinesJudge(
guidelines={
"accuracy": ["The response must be factually accurate"],
"completeness": ["The response must fully address the user's question"]
}
)
]
)
# Create from dictionary
config_dict = {
"sample": 0.3,
"assessments": [
{"name": "safety"},
{"name": "relevance_to_query", "sample_rate": 0.5}
]
}
config = AssessmentsSuiteConfig.from_dict(config_dict)
BuiltinJudge
Configuration for a built-in judge to run on traces.
# These packages are automatically installed with mlflow[databricks]
from databricks.agents.monitoring import BuiltinJudge
@dataclasses.dataclass
class BuiltinJudge:
name: Literal["safety", "groundedness", "relevance_to_query", "chunk_relevance"]
sample_rate: float | None = None
Attributes

| Attribute | Type | Description |
|---|---|---|
| `name` | `str` | Name of the built-in judge. Must be one of: `"safety"`, `"groundedness"`, `"relevance_to_query"`, `"chunk_relevance"` |
| `sample_rate` | `float` or `None` | Optional override sampling rate (0.0 to 1.0) for this specific judge |
Available built-in judges
- safety - detects harmful or toxic content in responses
- groundedness - assesses whether responses are grounded in the retrieved context (RAG applications)
- relevance_to_query - checks whether the response addresses the user's request
- chunk_relevance - evaluates the relevance of each retrieved chunk (RAG applications)
GuidelinesJudge
Configuration for the guideline-adherence judge, which evaluates custom business rules.
# These packages are automatically installed with mlflow[databricks]
from databricks.agents.monitoring import GuidelinesJudge
@dataclasses.dataclass
class GuidelinesJudge:
guidelines: dict[str, list[str]]
sample_rate: float | None = None
name: Literal["guideline_adherence"] = "guideline_adherence" # Set automatically
Attributes

| Attribute | Type | Description |
|---|---|---|
| `guidelines` | `dict[str, list[str]]` | Dictionary mapping guideline names to lists of guideline descriptions |
| `sample_rate` | `float` or `None` | Optional override sampling rate (0.0 to 1.0) for this judge |
Example
from databricks.agents.monitoring import GuidelinesJudge
# Create guidelines judge with multiple business rules
guidelines_judge = GuidelinesJudge(
guidelines={
"data_privacy": [
"The response must not reveal any personal customer information",
"The response must not include internal system details"
],
"brand_voice": [
"The response must maintain a professional yet friendly tone",
"The response must use 'we' instead of 'I' when referring to the company"
],
"accuracy": [
"The response must only provide information that can be verified",
"The response must acknowledge uncertainty when appropriate"
]
},
sample_rate=0.8 # Evaluate 80% of traces with these guidelines
)
ExternalMonitor
Represents a monitor for a GenAI application served outside of Databricks.
@dataclasses.dataclass
class ExternalMonitor:
experiment_id: str
assessments_config: AssessmentsSuiteConfig
trace_archive_table: str | None
_checkpoint_table: str
_legacy_ingestion_endpoint_name: str
@property
def monitoring_page_url(self) -> str
Attributes

| Attribute | Type | Description |
|---|---|---|
| `experiment_id` | `str` | ID of the MLflow experiment associated with this monitor |
| `assessments_config` | `AssessmentsSuiteConfig` | Configuration for the assessments being run |
| `trace_archive_table` | `str` or `None` | Unity Catalog table where traces are archived |
| `monitoring_page_url` | `str` | URL for viewing monitoring results in the MLflow UI |