Important
This feature is in Beta. Workspace admins can control access to this feature from the Previews page. See Manage Azure Databricks previews.
Note
Databricks recommends using registered scorers for production monitoring. See the scorer lifecycle management API reference.
Production monitoring lets you continuously evaluate the quality of your GenAI application by automatically running scorers on live traffic. The monitoring service runs every 15 minutes, evaluating a configurable sample of traces with the same scorers you used during development.
How it works
When production monitoring is enabled for an MLflow experiment:
- Automatic execution - A background job runs every 15 minutes (after initial setup)
- Scorer evaluation - Each configured scorer runs on a sample of production traces
- Feedback attachment - Results are attached to each evaluated trace as feedback
- Data archival - All traces (not just the sampled ones) are written to a Delta table in Unity Catalog for analysis

The monitoring service ensures consistent evaluation using the same scorers from development, providing automated quality assessment without manual intervention.
API reference
create_external_monitor
Creates a monitor for a GenAI application served outside of Databricks. Once created, the monitor begins automatically evaluating traces against the configured assessment suite.
# These packages are automatically installed with mlflow[databricks]
from databricks.agents.monitoring import create_external_monitor
create_external_monitor(
    *,
    catalog_name: str,
    schema_name: str,
    assessments_config: AssessmentsSuiteConfig | dict,
    experiment_id: str | None = None,
    experiment_name: str | None = None,
) -> ExternalMonitor
Parameters

| Parameter | Type | Description |
|---|---|---|
| `catalog_name` | `str` | Name of the Unity Catalog catalog in which the trace archive table will be created |
| `schema_name` | `str` | Name of the Unity Catalog schema in which the trace archive table will be created |
| `assessments_config` | `AssessmentsSuiteConfig` or `dict` | Configuration for the suite of assessments to run on traces |
| `experiment_id` | `str` or `None` | ID of the MLflow experiment to associate with the monitor. Defaults to the currently active experiment |
| `experiment_name` | `str` or `None` | Name of the MLflow experiment to associate with the monitor. Defaults to the currently active experiment |
Returns
`ExternalMonitor` - The created monitor object, containing the experiment ID, configuration, and monitoring URL
Example
import mlflow
from databricks.agents.monitoring import create_external_monitor, AssessmentsSuiteConfig, BuiltinJudge, GuidelinesJudge
# Create a monitor with multiple scorers
external_monitor = create_external_monitor(
    catalog_name="workspace",
    schema_name="default",
    assessments_config=AssessmentsSuiteConfig(
        sample=0.5,  # Sample 50% of traces
        assessments=[
            BuiltinJudge(name="safety"),
            BuiltinJudge(name="relevance_to_query"),
            BuiltinJudge(name="groundedness", sample_rate=0.2),  # Override sampling for this scorer
            GuidelinesJudge(
                guidelines={
                    "mlflow_only": [
                        "If the request is unrelated to MLflow, the response must refuse to answer."
                    ],
                    "professional_tone": [
                        "The response must maintain a professional and helpful tone."
                    ],
                }
            ),
        ],
    ),
)
print(f"Monitor created for experiment: {external_monitor.experiment_id}")
print(f"View traces at: {external_monitor.monitoring_page_url}")
get_external_monitor
Retrieves an existing monitor for a GenAI application served outside of Databricks.
# These packages are automatically installed with mlflow[databricks]
from databricks.agents.monitoring import get_external_monitor
get_external_monitor(
    *,
    experiment_id: str | None = None,
    experiment_name: str | None = None,
) -> ExternalMonitor
Parameters

| Parameter | Type | Description |
|---|---|---|
| `experiment_id` | `str` or `None` | ID of the MLflow experiment associated with the monitor |
| `experiment_name` | `str` or `None` | Name of the MLflow experiment associated with the monitor |
Returns
`ExternalMonitor` - The retrieved monitor object
Raises
- `ValueError` - When both `experiment_id` and `experiment_name` are provided
- `NoMonitorFoundError` - When no monitor is found for the given experiment
Example
from databricks.agents.monitoring import get_external_monitor
# Get monitor by experiment ID
monitor = get_external_monitor(experiment_id="123456789")
# Get monitor by experiment name
monitor = get_external_monitor(experiment_name="my-genai-app-experiment")
# Access monitor configuration
print(f"Sampling rate: {monitor.assessments_config.sample}")
print(f"Archive table: {monitor.trace_archive_table}")
update_external_monitor
Updates the configuration of an existing monitor. The configuration is completely replaced with the new values (not merged).
# These packages are automatically installed with mlflow[databricks]
from databricks.agents.monitoring import update_external_monitor
update_external_monitor(
    *,
    experiment_id: str | None = None,
    experiment_name: str | None = None,
    assessments_config: AssessmentsSuiteConfig | dict,
) -> ExternalMonitor
Parameters

| Parameter | Type | Description |
|---|---|---|
| `experiment_id` | `str` or `None` | ID of the MLflow experiment associated with the monitor |
| `experiment_name` | `str` or `None` | Name of the MLflow experiment associated with the monitor |
| `assessments_config` | `AssessmentsSuiteConfig` or `dict` | The updated configuration, which completely replaces the existing configuration |
Returns
`ExternalMonitor` - The updated monitor object
Raises
- `ValueError` - When `assessments_config` is not provided
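Because `update_external_monitor` replaces the configuration outright rather than merging it, the new `assessments_config` must restate every assessment you want to keep. A minimal sketch of a replacement config in dict form (the field names follow `AssessmentsSuiteConfig` below; the experiment ID and values are placeholders):

```python
# Full replacement config in dict form; update_external_monitor does not
# merge, so every assessment you want to keep must be restated here.
replacement_config = {
    "sample": 0.1,  # new global sampling rate
    "assessments": [
        {"name": "safety"},  # restated from the previous config
        {"name": "groundedness", "sample_rate": 0.05},  # newly added
    ],
}

# The dict would then be passed to the API, e.g.:
# update_external_monitor(
#     experiment_id="123456789",
#     assessments_config=replacement_config,
# )
```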
delete_external_monitor
Deletes the monitor for a GenAI application served outside of Databricks.
# These packages are automatically installed with mlflow[databricks]
from databricks.agents.monitoring import delete_external_monitor
delete_external_monitor(
    *,
    experiment_id: str | None = None,
    experiment_name: str | None = None,
) -> None
Parameters

| Parameter | Type | Description |
|---|---|---|
| `experiment_id` | `str` or `None` | ID of the MLflow experiment associated with the monitor |
| `experiment_name` | `str` or `None` | Name of the MLflow experiment associated with the monitor |
Example
from databricks.agents.monitoring import delete_external_monitor
# Delete monitor by experiment ID
delete_external_monitor(experiment_id="123456789")
# Delete monitor by experiment name
delete_external_monitor(experiment_name="my-genai-app-experiment")
Configuration classes
AssessmentsSuiteConfig
Configuration for a suite of assessments to run on a GenAI application's traces.
# These packages are automatically installed with mlflow[databricks]
from databricks.agents.monitoring import AssessmentsSuiteConfig
@dataclasses.dataclass
class AssessmentsSuiteConfig:
    sample: float | None = None
    paused: bool | None = None
    assessments: list[AssessmentConfig] | None = None
Attributes

| Attribute | Type | Description |
|---|---|---|
| `sample` | `float` or `None` | Global sampling rate, between 0.0 (exclusive) and 1.0 (inclusive). Individual assessments can override this |
| `paused` | `bool` or `None` | Whether monitoring is paused |
| `assessments` | `list[AssessmentConfig]` or `None` | List of assessments to run on traces |
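The interaction between the global `sample` and a per-assessment `sample_rate` can be sketched in plain Python (an illustrative helper, not part of the `databricks.agents.monitoring` API):

```python
def effective_sample_rate(global_sample, assessment_sample_rate):
    """Illustrative helper, not part of the databricks.agents.monitoring
    API: a per-assessment sample_rate, when set, overrides the global
    sample value; otherwise the global value applies."""
    if assessment_sample_rate is not None:
        return assessment_sample_rate
    return global_sample

# With sample=0.5 globally and a per-judge override of 0.2:
print(effective_sample_rate(0.5, 0.2))   # the override wins
print(effective_sample_rate(0.5, None))  # falls back to the global rate
```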
Methods
from_dict
Creates an AssessmentsSuiteConfig from a dictionary representation.
@classmethod
def from_dict(cls, data: dict) -> AssessmentsSuiteConfig
get_guidelines_judge
Returns the first GuidelinesJudge from the assessments list, or None if not found.
def get_guidelines_judge(self) -> GuidelinesJudge | None
Example
from databricks.agents.monitoring import AssessmentsSuiteConfig, BuiltinJudge, GuidelinesJudge
# Create configuration with multiple assessments
config = AssessmentsSuiteConfig(
    sample=0.3,  # Sample 30% of all traces
    assessments=[
        BuiltinJudge(name="safety"),
        BuiltinJudge(name="relevance_to_query", sample_rate=0.5),  # Override to 50%
        GuidelinesJudge(
            guidelines={
                "accuracy": ["The response must be factually accurate"],
                "completeness": ["The response must fully address the user's question"],
            }
        ),
    ],
)
# Create from dictionary
config_dict = {
    "sample": 0.3,
    "assessments": [
        {"name": "safety"},
        {"name": "relevance_to_query", "sample_rate": 0.5},
    ],
}
config = AssessmentsSuiteConfig.from_dict(config_dict)
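The behavior of `get_guidelines_judge` can be sketched with plain dicts (an illustrative equivalent, not the library's implementation; entries are modeled as dicts with a `name` field):

```python
def first_guidelines_judge(assessments):
    """Illustrative, plain-dict equivalent of get_guidelines_judge:
    return the first guideline-adherence entry, or None if absent."""
    for assessment in assessments:
        if assessment.get("name") == "guideline_adherence":
            return assessment
    return None

assessments = [
    {"name": "safety"},
    {"name": "guideline_adherence", "guidelines": {"tone": ["Maintain a professional tone."]}},
]
found = first_guidelines_judge(assessments)
```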
BuiltinJudge
Configuration for a built-in judge to run on traces.
# These packages are automatically installed with mlflow[databricks]
from databricks.agents.monitoring import BuiltinJudge
@dataclasses.dataclass
class BuiltinJudge:
    name: Literal["safety", "groundedness", "relevance_to_query", "chunk_relevance"]
    sample_rate: float | None = None
Attributes

| Attribute | Type | Description |
|---|---|---|
| `name` | `str` | Name of the built-in judge. Must be one of: `"safety"`, `"groundedness"`, `"relevance_to_query"`, `"chunk_relevance"` |
| `sample_rate` | `float` or `None` | Optional override sampling rate (0.0 to 1.0) for this specific judge |
Available built-in judges
- `safety` - Detects harmful or toxic content in responses
- `groundedness` - Assesses whether the response is grounded in the retrieved context (RAG applications)
- `relevance_to_query` - Checks whether the response addresses the user's request
- `chunk_relevance` - Assesses the relevance of each retrieved chunk (RAG applications)
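The name and sample-rate constraints above can be checked up front when building configs as dicts (e.g. for `from_dict`). A minimal sketch; this validator is illustrative and not part of the `databricks.agents.monitoring` API:

```python
# Judge names accepted by BuiltinJudge, per the Literal type above.
ALLOWED_BUILTIN_JUDGES = {"safety", "groundedness", "relevance_to_query", "chunk_relevance"}

def validate_builtin_judge(config):
    """Illustrative check mirroring the documented BuiltinJudge constraints;
    not part of the databricks.agents.monitoring API."""
    if config["name"] not in ALLOWED_BUILTIN_JUDGES:
        raise ValueError(f"unknown built-in judge: {config['name']}")
    rate = config.get("sample_rate")
    if rate is not None and not 0.0 <= rate <= 1.0:
        raise ValueError("sample_rate must be between 0.0 and 1.0")
    return config

judge = validate_builtin_judge({"name": "groundedness", "sample_rate": 0.2})
```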
GuidelinesJudge
Configuration for a guideline-adherence judge that evaluates custom business rules.
# These packages are automatically installed with mlflow[databricks]
from databricks.agents.monitoring import GuidelinesJudge
@dataclasses.dataclass
class GuidelinesJudge:
    guidelines: dict[str, list[str]]
    sample_rate: float | None = None
    name: Literal["guideline_adherence"] = "guideline_adherence"  # Set automatically
Attributes

| Attribute | Type | Description |
|---|---|---|
| `guidelines` | `dict[str, list[str]]` | Dictionary mapping guideline names to lists of guideline descriptions |
| `sample_rate` | `float` or `None` | Optional override sampling rate (0.0 to 1.0) for this judge |
Example
from databricks.agents.monitoring import GuidelinesJudge
# Create guidelines judge with multiple business rules
guidelines_judge = GuidelinesJudge(
    guidelines={
        "data_privacy": [
            "The response must not reveal any personal customer information",
            "The response must not include internal system details",
        ],
        "brand_voice": [
            "The response must maintain a professional yet friendly tone",
            "The response must use 'we' instead of 'I' when referring to the company",
        ],
        "accuracy": [
            "The response must only provide information that can be verified",
            "The response must acknowledge uncertainty when appropriate",
        ],
    },
    sample_rate=0.8,  # Evaluate 80% of traces with these guidelines
)
ExternalMonitor
Represents a monitor for a GenAI application served outside of Databricks.
@dataclasses.dataclass
class ExternalMonitor:
    experiment_id: str
    assessments_config: AssessmentsSuiteConfig
    trace_archive_table: str | None
    _checkpoint_table: str
    _legacy_ingestion_endpoint_name: str

    @property
    def monitoring_page_url(self) -> str
Attributes

| Attribute | Type | Description |
|---|---|---|
| `experiment_id` | `str` | ID of the MLflow experiment associated with this monitor |
| `assessments_config` | `AssessmentsSuiteConfig` | Configuration for the assessments being run |
| `trace_archive_table` | `str` or `None` | Unity Catalog table where traces are archived |
| `monitoring_page_url` | `str` | URL for viewing monitoring results in the MLflow UI |