Synthetic data has become an important tool in the fine-tuning of language models to follow instructions and solve complex problems. Nevertheless, most open datasets to date lack multi-turn data and are collected from closed models, limiting progress on advancing open fine-tuning methods. We introduce Self Directed Synthetic Dialogues (SDSD), an experimental dataset consisting of guided conversations of language models talking to themselves. The dataset consists of multi-turn conversations generated with DBRX, Llama 2 70B, and Mistral Large, all instructed to follow a conversation plan generated prior to the conversation. We also explore including principles from Constitutional AI and other related works to create synthetic preference data via revisions to the final conversation turn. We hope this work encourages further exploration in multi-turn data and the use of open models for expanding the impact of synthetic data.