# Preparing the Dataset

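### 1.1 Confirm that fine-tuning is needed

Before you start, make sure fine-tuning is the right way to solve your problem. You should already have optimized your prompts and identified the specific problems the model still has.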

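### 1.2 Create demonstration conversations

Create a set of demonstration conversations. Each one should show the behavior you expect from the model in real use.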

Example format:

```json
{"messages": [{"role": "system", "content": "Marv is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "What's the capital of France?"}, {"role": "assistant", "content": "Paris, as if everyone doesn't know that already."}]}
```

### 1.3 Craft prompts

Before fine-tuning, take the set of instructions and prompts that you found worked best for the model, and include them in every training example for best results.
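Including the same instructions in every training example can be sketched as below. This is a minimal illustration, not part of any official tooling: `SYSTEM_PROMPT`, `build_example`, and `to_jsonl` are hypothetical names, and the output assumes the one-JSON-object-per-line layout implied by the `.jsonl` training file.

```python
import json

# Hypothetical system prompt; replace it with the instructions that worked best for you.
SYSTEM_PROMPT = "Marv is a factual chatbot that is also sarcastic."


def build_example(user_text, assistant_text, system_prompt=SYSTEM_PROMPT):
    """Return one training example that always carries the system prompt."""
    return {
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_text},
            {"role": "assistant", "content": assistant_text},
        ]
    }


def to_jsonl(examples):
    """Serialize examples as JSON Lines: one JSON object per line."""
    return "\n".join(json.dumps(ex, ensure_ascii=False) for ex in examples) + "\n"


pairs = [
    ("What's the capital of France?", "Paris, as if everyone doesn't know that already."),
]
jsonl_text = to_jsonl(build_example(user, reply) for user, reply in pairs)
```

Writing `jsonl_text` to a file such as `mydata.jsonl` would give a file in the same shape as the example format above.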

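## 2. Multi-turn chat examples

### 2.1 Building examples

An example in the chat format can contain more than one message with the assistant role. You can set the `weight` parameter on an assistant message to control how much that message counts during training. In the example below, a `weight` of 0 marks a reply to skip and a `weight` of 1 marks a reply to learn from.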

Example:

```json
{"messages": [{"role": "system", "content": "Marv is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "What's the capital of France?"}, {"role": "assistant", "content": "Paris", "weight": 0}, {"role": "user", "content": "Can you be more sarcastic?"}, {"role": "assistant", "content": "Paris, as if everyone doesn't know that already.", "weight": 1}]}
```

## 3. Fine-tuning and testing

### 3.1 Training and test split

Split the dataset into a training portion and a test portion so that you can evaluate the model's performance after training.
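One way to produce a training and test split is a seeded shuffle followed by a slice. The sketch below is an assumption about how you might do this locally; the 80/20 ratio, the `split_dataset` name, and the seed are illustrative choices, not values taken from this guide.

```python
import random


def split_dataset(examples, test_fraction=0.2, seed=0):
    """Shuffle a copy of the examples and return (train, test) lists."""
    if not 0 < test_fraction < 1:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(examples)
    # A fixed seed keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    # Hold out at least one example for testing.
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


examples = [{"id": i} for i in range(10)]
train_set, test_set = split_dataset(examples)
```

Each list can then be written to its own JSONL file, with only the training file used for the fine-tuning job.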

### 3.2 Token limits

Note that token limits differ from model to model. For GPT-3.5-turbo, for example, each training example is limited to 4,096 tokens.
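A pre-flight check for over-long examples might look like the sketch below. It is only a rough illustration: `rough_token_count` is a placeholder that counts whitespace-separated words, which usually underestimates what a real tokenizer reports, so swap in the tokenizer for your model through the `count_tokens` argument. The function names are hypothetical.

```python
def rough_token_count(text):
    """Crude placeholder: counts whitespace-separated words, not real tokens."""
    return len(text.split())


def oversized_examples(examples, limit=4096, count_tokens=rough_token_count):
    """Return the indexes of examples whose message contents exceed the limit."""
    too_long = []
    for index, example in enumerate(examples):
        total = sum(count_tokens(m["content"]) for m in example["messages"])
        if total > limit:
            too_long.append(index)
    return too_long


examples = [
    {"messages": [{"role": "user", "content": "short question"}]},
    {"messages": [{"role": "user", "content": "word " * 5000}]},
]
flagged = oversized_examples(examples)
```

Examples that are flagged can be shortened or split before the file is uploaded.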

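### 3.3 Estimating cost

You can estimate the cost of fine-tuning from the number of tokens in the training data and the number of training epochs.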
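As a back-of-the-envelope sketch, the estimate can be written as tokens multiplied by epochs multiplied by a price per 1,000 tokens. The linear billing model, the `estimate_training_cost` name, and every number below are assumptions for illustration; check the current pricing for your model before relying on the result.

```python
def estimate_training_cost(tokens_in_file, epochs, price_per_1k_tokens):
    """Estimate cost as (file tokens x epochs) billed at a per-1K-token price."""
    billed_tokens = tokens_in_file * epochs
    return billed_tokens / 1000 * price_per_1k_tokens


# Hypothetical numbers for illustration only; this is not actual pricing.
cost = estimate_training_cost(
    tokens_in_file=100_000,
    epochs=3,
    price_per_1k_tokens=0.008,
)
```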

## 4. Data format and upload

### 4.1 Check the data format

Before creating a fine-tuning job, make sure your data is correctly formatted.
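A simple local check can catch the most common formatting mistakes before upload. The sketch below assumes the chat format shown in the examples on this page (a `messages` list of objects with `role` and `content`); the set of allowed roles and the `validate_jsonl` name are assumptions, and the checks are not an exhaustive specification of what the service accepts.

```python
import json

# Assumption: only the roles that appear in this guide's examples are valid.
ALLOWED_ROLES = {"system", "user", "assistant"}


def validate_jsonl(text):
    """Return (line_number, problem) tuples; an empty list means no problems found."""
    problems = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines
        try:
            example = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append((line_number, f"invalid JSON: {exc.msg}"))
            continue
        messages = example.get("messages") if isinstance(example, dict) else None
        if not isinstance(messages, list) or not messages:
            problems.append((line_number, "missing or empty 'messages' list"))
            continue
        for message in messages:
            if not isinstance(message, dict):
                problems.append((line_number, "each message must be an object"))
                continue
            if message.get("role") not in ALLOWED_ROLES:
                problems.append((line_number, f"unexpected role: {message.get('role')!r}"))
            if not isinstance(message.get("content"), str):
                problems.append((line_number, "message 'content' must be a string"))
        if not any(isinstance(m, dict) and m.get("role") == "assistant" for m in messages):
            problems.append((line_number, "no assistant message to learn from"))
    return problems
```

For example, `validate_jsonl(open("mydata.jsonl", encoding="utf-8").read())` would return an empty list for a file in which every line passes these checks.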

### 4.2 Upload the training file

Use the following code to upload your training file:

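```python
from afarensis import Afarensis

client = Afarensis()

# Upload the JSONL training file so it can be used by a fine-tuning job.
client.files.create(
    file=open("mydata.jsonl", "rb"),
    purpose="fine-tune",
)
```

## 5. Conclusion

Fine-tuning is an effective way to improve chatbot performance. By preparing carefully designed demonstration conversations and following best practices, you can improve the model's accuracy and response quality.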


