
Production monitoring API reference (legacy)

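Important

This feature is in Beta. Workspace admins can control access to this feature from the Previews page. See Manage Azure Databricks previews.

Note

Databricks recommends using registered scorers for production monitoring. See the scorer lifecycle management API reference.

Production monitoring lets you continuously evaluate the quality of your GenAI application by automatically running scorers on live traffic. The monitoring service runs every 15 minutes and evaluates a configurable sample of traces using the same scorers you use in development.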

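How it works

When you enable production monitoring for an MLflow experiment:

  1. Automatic execution - A background job runs every 15 minutes (after initial setup)

  2. Scorer evaluation - Each configured scorer runs on a sample of your production traces

  3. Feedback attachment - Results are attached as feedback to each evaluated trace

  4. Data archival - All traces (not just the sampled ones) are written to a Delta table in Unity Catalog for analysis

The monitoring service applies the same scorers you use in development, so evaluation stays consistent and quality assessment runs automatically without manual intervention.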
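
To make the sampling step concrete, the sketch below simulates how a fraction of traces might be picked for scoring in a single run. It is only an illustration: the helper `sample_traces`, the trace IDs, and the choice of independent random draws per trace are assumptions, not a description of how the monitoring service selects traces internally.

```python
import random


def sample_traces(trace_ids, sample_rate, seed=None):
    """Return the trace IDs selected at the given sampling rate.

    Hypothetical helper: each trace is kept independently with
    probability `sample_rate`. The real service may sample differently.
    """
    if not 0.0 < sample_rate <= 1.0:
        raise ValueError("sample_rate must be in the range (0.0, 1.0]")
    rng = random.Random(seed)
    # rng.random() returns a float in [0.0, 1.0), so a rate of 1.0 keeps everything
    return [trace_id for trace_id in trace_ids if rng.random() < sample_rate]


# Made-up trace IDs standing in for one 15-minute window of traffic
traces = [f"trace-{i}" for i in range(1000)]
selected = sample_traces(traces, sample_rate=0.5, seed=42)
print(f"Selected {len(selected)} of {len(traces)} traces for scoring")
```

In this sketch only the selected traces would be scored; per step 4 above, every trace is still archived regardless of sampling.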

API reference

create_external_monitor

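Creates a monitor for a GenAI application served outside Databricks. Once created, the monitor begins automatically evaluating traces according to the configured assessment suite.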

# These packages are automatically installed with mlflow[databricks]
from databricks.agents.monitoring import create_external_monitor

create_external_monitor(
    *,
    catalog_name: str,
    schema_name: str,
    assessments_config: AssessmentsSuiteConfig | dict,
    experiment_id: str | None = None,
    experiment_name: str | None = None,
) -> ExternalMonitor

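Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| catalog_name | str | Name of the Unity Catalog catalog where the trace archive table will be created |
| schema_name | str | Name of the Unity Catalog schema where the trace archive table will be created |
| assessments_config | AssessmentsSuiteConfig or dict | Configuration for the suite of assessments to run on traces |
| experiment_id | str or None | ID of the MLflow experiment to associate with the monitor. Defaults to the currently active experiment |
| experiment_name | str or None | Name of the MLflow experiment to associate with the monitor. Defaults to the currently active experiment |

Returns

ExternalMonitor - The created monitor object, containing the experiment ID, configuration, and monitoring URL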

Example

import mlflow
from databricks.agents.monitoring import create_external_monitor, AssessmentsSuiteConfig, BuiltinJudge, GuidelinesJudge

# Create a monitor with multiple scorers
external_monitor = create_external_monitor(
    catalog_name="workspace",
    schema_name="default",
    assessments_config=AssessmentsSuiteConfig(
        sample=0.5,  # Sample 50% of traces
        assessments=[
            BuiltinJudge(name="safety"),
            BuiltinJudge(name="relevance_to_query"),
            BuiltinJudge(name="groundedness", sample_rate=0.2),  # Override sampling for this scorer
            GuidelinesJudge(
                guidelines={
                    "mlflow_only": [
                        "If the request is unrelated to MLflow, the response must refuse to answer."
                    ],
                    "professional_tone": [
                        "The response must maintain a professional and helpful tone."
                    ]
                }
            ),
        ],
    ),
)

print(f"Monitor created for experiment: {external_monitor.experiment_id}")
print(f"View traces at: {external_monitor.monitoring_page_url}")

get_external_monitor

Retrieves an existing monitor for a GenAI application served outside Databricks.

# These packages are automatically installed with mlflow[databricks]
from databricks.agents.monitoring import get_external_monitor

get_external_monitor(
    *,
    experiment_id: str | None = None,
    experiment_name: str | None = None,
) -> ExternalMonitor

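Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| experiment_id | str or None | ID of the MLflow experiment associated with the monitor |
| experiment_name | str or None | Name of the MLflow experiment associated with the monitor |

Returns

ExternalMonitor - The retrieved monitor object

Raises

  • ValueError - When experiment_id and experiment_name are provided
  • NoMonitorFoundError - When no monitor is found for the given experiment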

Example

from databricks.agents.monitoring import get_external_monitor

# Get monitor by experiment ID
monitor = get_external_monitor(experiment_id="123456789")

# Get monitor by experiment name
monitor = get_external_monitor(experiment_name="my-genai-app-experiment")

# Access monitor configuration
print(f"Sampling rate: {monitor.assessments_config.sample}")
print(f"Archive table: {monitor.trace_archive_table}")

update_external_monitor

Updates the configuration of an existing monitor. The configuration is completely replaced with the new values (not merged).

# These packages are automatically installed with mlflow[databricks]
from databricks.agents.monitoring import update_external_monitor

update_external_monitor(
    *,
    experiment_id: str | None = None,
    experiment_name: str | None = None,
    assessments_config: AssessmentsSuiteConfig | dict,
) -> ExternalMonitor

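Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| experiment_id | str or None | ID of the MLflow experiment associated with the monitor |
| experiment_name | str or None | Name of the MLflow experiment associated with the monitor |
| assessments_config | AssessmentsSuiteConfig or dict | Updated configuration that completely replaces the existing configuration |

Returns

ExternalMonitor - The updated monitor object

Raises

  • ValueError - When assessments_config is not provided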
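
Because the update replaces the configuration rather than merging it, any setting you leave out is dropped. One way to avoid that is to build the complete new configuration first and pass it in a single call. The sketch below does this on plain dictionaries, using the same keys as the dictionary form accepted by `AssessmentsSuiteConfig.from_dict` (`sample`, `assessments`, `name`, `sample_rate`). The helper `build_replacement_config` is hypothetical and not part of `databricks.agents.monitoring`.

```python
import copy


def build_replacement_config(current, *, sample=None, add_assessments=()):
    """Return a complete config dict: `current` plus the requested changes.

    Hypothetical helper. `current` is a dict with "sample" and
    "assessments" keys; it is copied, never modified in place.
    """
    new_config = copy.deepcopy(current)
    if sample is not None:
        new_config["sample"] = sample
    existing_names = {a["name"] for a in new_config.get("assessments", [])}
    for assessment in add_assessments:
        # Skip assessments that are already configured under the same name
        if assessment["name"] not in existing_names:
            new_config.setdefault("assessments", []).append(dict(assessment))
            existing_names.add(assessment["name"])
    return new_config


current = {"sample": 0.3, "assessments": [{"name": "safety"}]}
updated = build_replacement_config(
    current,
    sample=0.5,
    add_assessments=[{"name": "groundedness", "sample_rate": 0.2}],
)
print(updated)

# With access to a Databricks workspace, the full dict could then be passed on:
# from databricks.agents.monitoring import update_external_monitor
# update_external_monitor(
#     experiment_name="my-genai-app-experiment",
#     assessments_config=updated,
# )
```

The commented-out call uses only the parameters listed above. It stays commented so the sketch runs without a workspace.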

delete_external_monitor

Deletes the monitor for a GenAI application served outside Databricks.

# These packages are automatically installed with mlflow[databricks]
from databricks.agents.monitoring import delete_external_monitor

delete_external_monitor(
    *,
    experiment_id: str | None = None,
    experiment_name: str | None = None,
) -> None

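Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| experiment_id | str or None | ID of the MLflow experiment associated with the monitor |
| experiment_name | str or None | Name of the MLflow experiment associated with the monitor |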

Example

from databricks.agents.monitoring import delete_external_monitor

# Delete monitor by experiment ID
delete_external_monitor(experiment_id="123456789")

# Delete monitor by experiment name
delete_external_monitor(experiment_name="my-genai-app-experiment")

Configuration classes

AssessmentsSuiteConfig

Configuration for a suite of assessments to run on traces from a GenAI application.

# These packages are automatically installed with mlflow[databricks]
from databricks.agents.monitoring import AssessmentsSuiteConfig

@dataclasses.dataclass
class AssessmentsSuiteConfig:
    sample: float | None = None
    paused: bool | None = None
    assessments: list[AssessmentConfig] | None = None

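Attributes

| Attribute | Type | Description |
| --- | --- | --- |
| sample | float or None | Global sampling rate between 0.0 (exclusive) and 1.0 (inclusive). Individual assessments can override this |
| paused | bool or None | Whether monitoring is paused |
| assessments | list[AssessmentConfig] or None | List of assessments to run on traces |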
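
The interaction between the global `sample` value and a per-assessment `sample_rate` can be sketched as a small resolution function. This illustrates the override rule described above and is not the service's implementation. The function `effective_sample_rate` is hypothetical, the inclusive 0.0 to 1.0 bounds for overrides are an assumption, and the behavior when neither value is set is not documented here, so the sketch returns `None` in that case.

```python
def effective_sample_rate(global_sample, override=None):
    """Resolve the sampling rate for one assessment.

    Hypothetical helper: a per-assessment override wins when it is set,
    otherwise the global rate applies.
    """
    if global_sample is not None and not 0.0 < global_sample <= 1.0:
        raise ValueError("global sample must be in the range (0.0, 1.0]")
    # Assumption: overrides accept the full inclusive range 0.0 to 1.0
    if override is not None and not 0.0 <= override <= 1.0:
        raise ValueError("sample_rate override must be between 0.0 and 1.0")
    if override is not None:
        return override
    return global_sample  # May be None if no rate is configured at either level


# Mirrors a suite with sample=0.3 and one assessment overridden to 0.5
overrides = {"safety": None, "relevance_to_query": 0.5}
for name, override in overrides.items():
    print(name, effective_sample_rate(0.3, override))
```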

Methods

from_dict

Creates an AssessmentsSuiteConfig from a dictionary representation.

@classmethod
def from_dict(cls, data: dict) -> AssessmentsSuiteConfig

get_guidelines_judge

Returns the first GuidelinesJudge in the assessments list, or None if none is found.

def get_guidelines_judge(self) -> GuidelinesJudge | None
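
The "first match or None" lookup can be illustrated with stand-in classes. The sketch below does not use the real library: `BuiltinJudgeStub`, `GuidelinesJudgeStub`, and `first_guidelines_judge` are hypothetical names that exist only to show the semantics, and the actual method may be implemented differently.

```python
from dataclasses import dataclass


@dataclass
class BuiltinJudgeStub:
    """Stand-in for a built-in judge entry (illustration only)."""
    name: str


@dataclass
class GuidelinesJudgeStub:
    """Stand-in for a guidelines judge entry (illustration only)."""
    guidelines: dict
    name: str = "guideline_adherence"


def first_guidelines_judge(assessments):
    """Return the first guidelines judge in the list, or None if absent."""
    return next(
        (a for a in assessments or [] if isinstance(a, GuidelinesJudgeStub)),
        None,
    )


suite = [
    BuiltinJudgeStub(name="safety"),
    GuidelinesJudgeStub(guidelines={"tone": ["Be professional"]}),
    GuidelinesJudgeStub(guidelines={"accuracy": ["Be factual"]}),
]
print(first_guidelines_judge(suite))
print(first_guidelines_judge([BuiltinJudgeStub(name="safety")]))
```

If a suite contains more than one guidelines judge, only the first is returned, so later entries need to be read from the assessments list directly.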

Example

from databricks.agents.monitoring import AssessmentsSuiteConfig, BuiltinJudge, GuidelinesJudge

# Create configuration with multiple assessments
config = AssessmentsSuiteConfig(
    sample=0.3,  # Sample 30% of all traces
    assessments=[
        BuiltinJudge(name="safety"),
        BuiltinJudge(name="relevance_to_query", sample_rate=0.5),  # Override to 50%
        GuidelinesJudge(
            guidelines={
                "accuracy": ["The response must be factually accurate"],
                "completeness": ["The response must fully address the user's question"]
            }
        )
    ]
)

# Create from dictionary
config_dict = {
    "sample": 0.3,
    "assessments": [
        {"name": "safety"},
        {"name": "relevance_to_query", "sample_rate": 0.5}
    ]
}
config = AssessmentsSuiteConfig.from_dict(config_dict)

BuiltinJudge

Configuration for a built-in judge to run on traces.

# These packages are automatically installed with mlflow[databricks]
from databricks.agents.monitoring import BuiltinJudge

@dataclasses.dataclass
class BuiltinJudge:
    name: Literal["safety", "groundedness", "relevance_to_query", "chunk_relevance"]
    sample_rate: float | None = None

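Attributes

| Attribute | Type | Description |
| --- | --- | --- |
| name | str | Name of the built-in judge. Must be one of: "safety", "groundedness", "relevance_to_query", "chunk_relevance" |
| sample_rate | float or None | Optional override sampling rate for this specific judge (0.0 to 1.0) |

Available built-in judges

  • safety - Detects harmful or toxic content in responses
  • groundedness - Assesses whether the response is grounded in the retrieved context (RAG applications)
  • relevance_to_query - Checks whether the response addresses the user's request
  • chunk_relevance - Evaluates the relevance of each retrieved chunk (RAG applications)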

GuidelinesJudge

Configuration for a guideline adherence judge that evaluates custom business rules.

# These packages are automatically installed with mlflow[databricks]
from databricks.agents.monitoring import GuidelinesJudge

@dataclasses.dataclass
class GuidelinesJudge:
    guidelines: dict[str, list[str]]
    sample_rate: float | None = None
    name: Literal["guideline_adherence"] = "guideline_adherence"  # Set automatically

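Attributes

| Attribute | Type | Description |
| --- | --- | --- |
| guidelines | dict[str, list[str]] | Dictionary mapping guideline names to lists of guideline descriptions |
| sample_rate | float or None | Optional override sampling rate for this judge (0.0 to 1.0) |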

Example

from databricks.agents.monitoring import GuidelinesJudge

# Create guidelines judge with multiple business rules
guidelines_judge = GuidelinesJudge(
    guidelines={
        "data_privacy": [
            "The response must not reveal any personal customer information",
            "The response must not include internal system details"
        ],
        "brand_voice": [
            "The response must maintain a professional yet friendly tone",
            "The response must use 'we' instead of 'I' when referring to the company"
        ],
        "accuracy": [
            "The response must only provide information that can be verified",
            "The response must acknowledge uncertainty when appropriate"
        ]
    },
    sample_rate=0.8  # Evaluate 80% of traces with these guidelines
)

ExternalMonitor

Represents a monitor for a GenAI application served outside Databricks.

@dataclasses.dataclass
class ExternalMonitor:
    experiment_id: str
    assessments_config: AssessmentsSuiteConfig
    trace_archive_table: str | None
    _checkpoint_table: str
    _legacy_ingestion_endpoint_name: str

    @property
    def monitoring_page_url(self) -> str

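Attributes

| Attribute | Type | Description |
| --- | --- | --- |
| experiment_id | str | ID of the MLflow experiment associated with this monitor |
| assessments_config | AssessmentsSuiteConfig | Configuration of the assessments being run |
| trace_archive_table | str or None | Unity Catalog table where traces are archived |
| monitoring_page_url | str | URL for viewing monitoring results in the MLflow UI |

Next steps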