Large language models (LLMs) can generate intermediate reasoning steps. To elicit reliable reasoning, the common practice is few-shot chain-of-thought (CoT) prompting, in which several in-context reasoning demonstrations are prepended to the question. However, such CoT examples are expensive to craft, especially in professional domains, and their quality can vary widely across human annotators. This work therefore investigates whether LLMs can teach themselves to reason without human-crafted demonstrations. Inspired by "encoding specificity" in human memory retrieval, we propose SELF-EXPLAIN, which has the LLM generate its own CoT examples. We find that using self-explanations makes LLMs more confident, better calibrated, and less biased when answering complex questions. Moreover, prompting with self-explanations can even significantly outperform human-crafted CoTs on several complex question answering datasets.
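As a rough illustration of the few-shot chain-of-thought setup described in the abstract, the sketch below assembles a prompt by prepending worked demonstrations to a target question. The `Demo` class, the `build_cot_prompt` helper, the `Q:`/`A:` layout, and the example questions are all hypothetical choices made for this illustration; they are not taken from the paper. The sketch does not cover how SELF-EXPLAIN generates its demonstrations, since the abstract does not specify that procedure.

```python
from dataclasses import dataclass


@dataclass
class Demo:
    """One in-context demonstration (hypothetical structure)."""
    question: str
    rationale: str  # intermediate reasoning steps
    answer: str


def build_cot_prompt(demos, question):
    """Prepend worked demonstrations to the target question.

    The "Q:/A:" layout is an assumed convention, not the paper's format.
    """
    blocks = [
        f"Q: {d.question}\nA: {d.rationale} The answer is {d.answer}."
        for d in demos
    ]
    # The final block leaves the answer open for the model to complete.
    blocks.append(f"Q: {question}\nA:")
    return "\n\n".join(blocks)


demos = [
    Demo(
        question="A pack has 3 pens and costs $6. What does one pen cost?",
        rationale="The pack costs $6 and holds 3 pens, so one pen costs 6 / 3 = 2 dollars.",
        answer="$2",
    )
]
prompt = build_cot_prompt(
    demos, "A box has 4 cups and costs $12. What does one cup cost?"
)
print(prompt)
```

The same helper would apply whether the rationales are written by human annotators or produced by the model itself; only the source of the `demos` list changes.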