Large language models (LLMs) are becoming increasingly important for machine learning applications. However, it can be challenging to align LLMs with our intent, particularly when we want the generated content to be preferable to the alternatives, or when we want the LLM to respond in a style or tone that is hard to describe. To address this challenge, we propose an approach that uses contrastive examples to better describe our intent. This involves providing positive examples that illustrate the true intent, along with negative examples that show which characteristics we want the LLM to avoid. The negative examples can be retrieved from labeled data, written by a human, or generated by the LLM itself. Before generating an answer, we ask the model to analyze the examples and work out for itself what to avoid. This reasoning step gives the model a clear articulation of the user's need and guides it towards generating a better answer. We tested our approach on both synthesized and real-world datasets, including StackExchange and Reddit, and found that it significantly improves performance compared to standard few-shot prompting.
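As a rough illustration of the approach described above, the following sketch assembles a contrastive prompt from positive examples, negative examples, and an explicit instruction to analyze them before answering. The function name `build_contrastive_prompt`, its parameters, and the template wording are all assumptions made for illustration; they are not the exact prompt used in this work.

```python
from typing import Sequence


def build_contrastive_prompt(
    instruction: str,
    positives: Sequence[str],
    negatives: Sequence[str],
    query: str,
) -> str:
    """Assemble a few-shot prompt with positive and negative examples.

    The template wording is an illustrative assumption, not the
    authors' exact prompt.
    """
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")

    lines = [instruction, ""]
    # Positive examples illustrate the intended style or quality.
    for i, text in enumerate(positives, start=1):
        lines.append(f"Positive example {i} (preferred):")
        lines.append(text)
        lines.append("")
    # Negative examples show the characteristics to avoid.
    for i, text in enumerate(negatives, start=1):
        lines.append(f"Negative example {i} (to avoid):")
        lines.append(text)
        lines.append("")
    # Reasoning step: request an analysis of the examples before the answer.
    lines.append(
        "First, analyze what distinguishes the positive examples from the "
        "negative ones and state which characteristics to avoid."
    )
    lines.append("Then write your answer to the following input.")
    lines.append("")
    lines.append(f"Input: {query}")
    return "\n".join(lines)


if __name__ == "__main__":
    prompt = build_contrastive_prompt(
        instruction="Answer the question in a concise, friendly tone.",
        positives=["Sure! Use `git stash` to set your changes aside."],
        negatives=["RTFM. This has been asked a thousand times."],
        query="How do I undo my last commit?",
    )
    print(prompt)
```

The returned string can be sent to any LLM in place of a standard few-shot prompt. The negative examples passed in could come from labeled data, a human author, or an earlier generation by the model itself.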