Optimize prompts using custom scorers

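This notebook walks you through creating a custom scorer with MLflow make_judge.

Built-in scorers and judges often do not fit every use case. A custom scorer or judge gives you an evaluation mechanism tailored to your task, which in turn gives the optimizer a more precise signal to work with.

The notebook walks through optimizing a prompt so that the model's output better follows Markdown formatting.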

%pip install --upgrade mlflow databricks-sdk databricks-openai dspy openai
dbutils.library.restartPython()

Use MLflow make_judge

make_judge, available in recent versions of MLflow, lets you tailor a judge to your specific use case.

from mlflow.genai.judges import make_judge

# Create a scorer for Markdown output quality
markdown_output_judge = make_judge(
    name="markdown_quality",
    instructions=(
        "Evaluate if the answer in {{ outputs }} follows Markdown formatting, accurately answers the question in {{ inputs }}, and matches {{ expectations }}. Rate as high, medium or low quality"
    ),
    model="databricks:/databricks-claude-sonnet-4-5"
)

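Objective function to map feedback

The feedback returned by the judge must be mapped to a numeric value that the optimizer can use. The optimizer also incorporates the feedback from the judge.

You need a function that provides this mapping back to the optimizer.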

def feedback_to_score(scores: dict) -> float:
    """Convert feedback values to numerical scores."""
    feedback_value = scores["markdown_quality"]

    # Map categorical feedback to numerical values
    feedback_mapping = {
        "high": 1.0,
        "medium": 0.5,
        "low": 0.0
    }

    # Handle Feedback objects by accessing .value attribute
    if hasattr(feedback_value, 'value'):
        feedback_str = str(feedback_value.value).lower()
    else:
        feedback_str = str(feedback_value).lower()

    return feedback_mapping.get(feedback_str, 0.0)
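
As a quick sanity check, the mapping can be exercised without calling a judge. The sketch below repeats `feedback_to_score` so that it runs on its own, and uses a hypothetical `FakeFeedback` stand-in for any object exposing a `.value` attribute; it is not MLflow's own feedback class.

```python
from dataclasses import dataclass


def feedback_to_score(scores: dict) -> float:
    """Convert feedback values to numerical scores (same logic as above)."""
    feedback_value = scores["markdown_quality"]
    feedback_mapping = {"high": 1.0, "medium": 0.5, "low": 0.0}
    if hasattr(feedback_value, "value"):
        feedback_str = str(feedback_value.value).lower()
    else:
        feedback_str = str(feedback_value).lower()
    return feedback_mapping.get(feedback_str, 0.0)


@dataclass
class FakeFeedback:
    """Hypothetical stand-in for an object that carries a `.value` attribute."""
    value: str


# Plain string, matched case-insensitively
print(feedback_to_score({"markdown_quality": "High"}))
# Object with a .value attribute
print(feedback_to_score({"markdown_quality": FakeFeedback("medium")}))
# Unrecognized label falls back to 0.0
print(feedback_to_score({"markdown_quality": "excellent"}))
```

Because unrecognized labels fall back to 0.0, a judge that answers with wording outside high, medium, or low is indistinguishable from a low rating. If that matters for your use case, consider logging unknown labels.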

Test the model

You can test the model as is. In the following example, the model does not produce Markdown-formatted output.

import mlflow
import openai
from mlflow.genai.optimize import GepaPromptOptimizer
from databricks_openai import DatabricksOpenAI

# Change this to your workspace catalog and schema
catalog = ""
schema = ""
prompt_location = f"{catalog}.{schema}.markdown"

openai_client = DatabricksOpenAI()

# Register initial prompt
prompt = mlflow.genai.register_prompt(
    name=prompt_location,
    template="Answer this question: {{question}}",
)

# Define your prediction function
def predict_fn(question: str) -> str:
    prompt = mlflow.genai.load_prompt(f"prompts:/{prompt_location}/1")
    completion = openai_client.chat.completions.create(
        model="databricks-gpt-oss-20b",
        messages=[{"role": "user", "content": prompt.format(question=question)}],
    )
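    return completion.choices[0].message.content

from IPython.display import Markdown

output = predict_fn("What is the capital of France?")

# Indexing with [1]['text'] assumes the endpoint returns the message content as a
# list of blocks with the answer text in the second block, not as a plain string.
Markdown(output[1]['text'])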
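
The `output[1]['text']` indexing above depends on the shape of the response content. A more defensive sketch follows, assuming the content is either a plain string or a list of dict blocks in which answer blocks look like `{"type": "text", "text": "..."}`. The helper name `extract_text` and the sample payload are illustrative and not part of any library.

```python
from typing import Any


def extract_text(content: Any) -> str:
    """Return the answer text from a chat message's content.

    Assumption: content is either a plain string or a list of dict blocks,
    where answer blocks carry "type": "text" and a "text" field.
    """
    if isinstance(content, str):
        return content
    if isinstance(content, list):
        parts = [
            block["text"]
            for block in content
            if isinstance(block, dict)
            and block.get("type") == "text"
            and "text" in block
        ]
        return "\n".join(parts)
    # Unknown shape: fall back to the string representation
    return str(content)


# Made-up sample payload for illustration, not real model output
sample_blocks = [
    {"type": "reasoning", "summary": []},
    {"type": "text", "text": "Paris"},
]
print(extract_text(sample_blocks))
print(extract_text("Paris"))
```

With a helper like this, `Markdown(extract_text(output))` would work for either response shape.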

Run the optimizer

Some example data is provided below.

# Training data with inputs and expected outputs
dataset = [
    {
        # The inputs schema should match with the input arguments of the prediction function.
        "inputs": {"question": "What is the capital of France?"},
        "expectations": {"expected_response": """## Paris - Capital of France

**Paris** is the capital and largest city of France, located in the *north-central* region.

### Key Facts:
- **Population**: ~2.2 million (city), ~12 million (metro area)
- **Founded**: 3rd century BC
- **Nickname**: *"City of Light"* (La Ville Lumière)

### Notable Landmarks:
1. **Eiffel Tower** - Iconic iron lattice tower
2. **Louvre Museum** - World's largest art museum
3. **Notre-Dame Cathedral** - Gothic masterpiece
4. **Arc de Triomphe** - Monument honoring French soldiers

> Paris is not only the political center but also a global hub for art, fashion, and culture."""},
    },
    {
        "inputs": {"question": "What is the capital of Germany?"},
        "expectations": {"expected_response": """## Berlin - Capital of Germany

**Berlin** is Germany's capital and largest city, situated in the *northeastern* part of the country.

### Historical Significance:
| Period | Importance |
|--------|------------|
| 1961-1989 | Divided by the **Berlin Wall** |
| 1990 | Reunification capital |
| Present | Political & cultural center |

### Must-See Attractions:
1. **Brandenburg Gate** - Neoclassical monument
2. **Reichstag Building** - Seat of German Parliament
3. **Museum Island** - UNESCO World Heritage site
4. **East Side Gallery** - Open-air gallery on Berlin Wall remnants

> *"Ich bin ein Berliner"* - Famous quote by JFK highlighting Berlin's symbolic importance during the Cold War."""},
    },
    {
        "inputs": {"question": "What is the capital of Japan?"},
        "expectations": {"expected_response": """## Tokyo (東京) - Capital of Japan

**Tokyo** is the capital of Japan and the world's most populous metropolitan area, located on the *eastern coast* of Honshu island.

### Demographics & Economy:
- **Population**: ~14 million (city), ~37 million (Greater Tokyo Area)
- **GDP**: One of the world's largest urban economies
- **Status**: Global financial hub and technology center

### Districts & Landmarks:
1. **Shibuya** - Famous crossing and youth culture
2. **Shinjuku** - Business district with Tokyo Metropolitan Government Building
3. **Asakusa** - Historic area with *Sensō-ji Temple*
4. **Akihabara** - Electronics and anime culture hub

### Cultural Blend:
- Ancient temples ⛩️ alongside futuristic skyscrapers 🏙️
- Traditional tea ceremonies 🍵 and cutting-edge technology 🤖

> Tokyo seamlessly combines **centuries-old traditions** with *ultra-modern innovation*, making it a unique global metropolis."""},
    },
    {
        "inputs": {"question": "What is the capital of Italy?"},
        "expectations": {"expected_response": """## Rome (Roma) - The Eternal City

**Rome** is the capital of Italy, famously known as *"The Eternal City"* (*La Città Eterna*), with over **2,750 years** of history.

### Historical Timeline:

753 BC → Founded (according to legend)
27 BC → Capital of Roman Empire
1871 → Capital of unified Italy
Present → Modern capital with ancient roots

### UNESCO World Heritage Sites:
1. **The Colosseum** - Ancient amphitheater (80 AD)
2. **Roman Forum** - Center of ancient Roman life
3. **Pantheon** - Best-preserved ancient Roman building
4. **Vatican City** - Independent city-state within Rome
   - *St. Peter's Basilica*
   - *Sistine Chapel* (Michelangelo's ceiling)

### Famous Quote:
> *"All roads lead to Rome"* - Ancient proverb reflecting Rome's historical importance as the center of the Roman Empire

### Cultural Significance:
- Birthplace of **Western civilization**
- Center of the *Catholic Church*
- Home to countless masterpieces of ***Renaissance art and architecture***"""},
    },
]

# Optimize the prompt
result = mlflow.genai.optimize_prompts(
    predict_fn=predict_fn,
    train_data=dataset,
    prompt_uris=[prompt.uri],
    optimizer=GepaPromptOptimizer(reflection_model="databricks:/databricks-claude-sonnet-4-5"),
    scorers=[markdown_output_judge],
    aggregation=feedback_to_score
)

# Use the optimized prompt
optimized_prompt = result.optimized_prompts[0]
print(f"Optimized template: {optimized_prompt.template}")
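
The dataset comment above notes that the `inputs` schema should match the arguments of the prediction function. A minimal sketch of a pre-flight check for that, using `inspect.signature`, is shown below. The helper `check_inputs_match`, the demo function, and the demo records are hypothetical stand-ins so that the sketch runs without a model endpoint. The check is strict: it also flags records that omit parameters with default values.

```python
import inspect


def check_inputs_match(predict_fn, dataset):
    """Return indices of records whose "inputs" keys differ from predict_fn's parameters."""
    expected = set(inspect.signature(predict_fn).parameters)
    return [
        i
        for i, record in enumerate(dataset)
        if set(record.get("inputs", {})) != expected
    ]


# Hypothetical stand-in for the real prediction function
def demo_predict_fn(question: str) -> str:
    return f"Answer to: {question}"


demo_dataset = [
    {"inputs": {"question": "What is the capital of France?"}},
    {"inputs": {"query": "What is the capital of Italy?"}},  # wrong key on purpose
]

# Prints the indices of the mismatched records
print(check_inputs_match(demo_predict_fn, demo_dataset))
```

Running a check like this on `dataset` and `predict_fn` before calling the optimizer can surface key mismatches early.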

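View the prompt

Open the link to the MLflow experiment and complete the following steps to make the prompt appear in the experiment:

  1. Make sure the experiment type is set to GenAI apps and agents.
  2. Navigate to the Prompts tab.
  3. Click Select schema in the upper right, then enter the same schema you set earlier to view the prompt.

Load the new prompt and test again

Inspect the optimized prompt, then load it into the prediction function to see how the model's output changes.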

from IPython.display import Markdown

# Version 10 is specific to this example; substitute the version number of your optimized prompt.
prompt = mlflow.genai.load_prompt(f"prompts:/{prompt_location}/10")

Markdown(prompt.template)

def predict_fn(question: str) -> str:
    prompt = mlflow.genai.load_prompt(f"prompts:/{prompt_location}/10")
    completion = openai_client.chat.completions.create(
        model="databricks-gpt-oss-20b",
        # load prompt template using PromptVersion.format()
        messages=[{"role": "user", "content": prompt.format(question=question)}],
    )
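    return completion.choices[0].message.content

output = predict_fn("What is the capital of France?")

# Indexing with [1]['text'] assumes the endpoint returns the message content as a
# list of blocks with the answer text in the second block, not as a plain string.
Markdown(output[1]['text'])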
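
To compare the outputs before and after optimization at a glance, a rough local heuristic can count a few Markdown constructs in each response. This is only an illustrative sketch: it is not a Markdown parser, it does not replace the judge, and the function name `markdown_features` and the sample strings are made up for the example.

```python
import re


def markdown_features(text: str) -> dict:
    """Count a few common Markdown constructs (rough heuristic, not a parser)."""
    return {
        # Lines starting with 1 to 6 '#' characters followed by a space
        "headings": len(re.findall(r"^#{1,6} ", text, flags=re.MULTILINE)),
        # **bold** spans on a single line
        "bold": len(re.findall(r"\*\*[^*\n]+\*\*", text)),
        # Bulleted or numbered list items, optionally indented
        "list_items": len(
            re.findall(r"^[ \t]*(?:[-*+]|\d+\.) ", text, flags=re.MULTILINE)
        ),
        # Lines starting with '> '
        "blockquotes": len(re.findall(r"^> ", text, flags=re.MULTILINE)),
    }


# Made-up sample strings, not real model output
plain = "The capital of France is Paris."
formatted = (
    "## Paris\n\n"
    "**Paris** is the capital of France.\n\n"
    "- Population: ~2.2 million\n\n"
    "> City of Light"
)

print(markdown_features(plain))
print(markdown_features(formatted))
```

Applying the same function to the responses from the original and the optimized prompt gives a quick, if coarse, indication of whether the formatting changed.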

Example notebook

The following runnable notebook shows how to optimize prompts using a custom scorer.

Prompt optimization using custom scorers

Get notebook