Natural language explanations (NLEs) are vital for elucidating the reasoning behind large language model (LLM) decisions. Many techniques have been developed to generate NLEs using LLMs. However, like humans, LLMs might not always produce optimal NLEs on the first attempt. Inspired by human learning processes, we introduce Cross-Refine, which employs role modeling by deploying two LLMs as generator and critic, respectively. The generator outputs a first NLE and then refines this initial explanation using feedback and suggestions provided by the critic. Cross-Refine does not require any supervised training data or additional training. We validate Cross-Refine across three NLP tasks using three state-of-the-art open-source LLMs through automatic and human evaluation. We select Self-Refine (Madaan et al., 2023) as the baseline, which utilizes only self-feedback to refine explanations. Our findings from automatic evaluation and a user study indicate that Cross-Refine outperforms Self-Refine. Moreover, Cross-Refine performs effectively with less powerful LLMs, whereas Self-Refine yields strong results only with ChatGPT. Additionally, we conduct an ablation study to assess the importance of feedback and suggestions, finding that both play an important role in refining explanations. We further evaluate Cross-Refine on a bilingual dataset in English and German.
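The generator-critic loop described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: `generate`, `critique`, and `refine` are hypothetical wrappers around prompted LLM calls, and the prompt strings are placeholders.

```python
# Minimal sketch of the Cross-Refine loop: a generator LLM drafts an
# explanation, a critic LLM returns feedback and a suggestion, and the
# generator revises. The `llm` arguments are any callables mapping a
# prompt string to a completion string (hypothetical interface).

def generate(generator_llm, task_input):
    """Generator produces an initial natural language explanation."""
    return generator_llm(f"Explain the model's decision for: {task_input}")

def critique(critic_llm, task_input, explanation):
    """Critic returns (feedback, suggestion) on the initial explanation."""
    feedback = critic_llm(f"Give feedback on this explanation: {explanation}")
    suggestion = critic_llm(f"Suggest an improvement for: {explanation}")
    return feedback, suggestion

def refine(generator_llm, explanation, feedback, suggestion):
    """Generator revises its explanation using the critic's output."""
    return generator_llm(
        f"Revise the explanation '{explanation}' "
        f"using the feedback '{feedback}' and the suggestion '{suggestion}'"
    )

def cross_refine(generator_llm, critic_llm, task_input):
    """One full Cross-Refine pass: draft, critique, refine."""
    initial = generate(generator_llm, task_input)
    feedback, suggestion = critique(critic_llm, task_input, initial)
    return refine(generator_llm, initial, feedback, suggestion)
```

Note that, unlike Self-Refine, the critic here is a second model rather than the generator itself, so no supervised data or additional training is needed.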