Natural language explanations (NLEs) are vital for elucidating the reasoning behind large language model (LLM) decisions. Many techniques have been developed to generate NLEs using LLMs. However, like humans, LLMs might not always produce optimal NLEs on the first attempt. Inspired by human learning processes, we introduce Cross-Refine, which employs two LLMs in the roles of generator and critic, respectively. The generator outputs an initial NLE and then refines it using the feedback and suggestions provided by the critic. Cross-Refine requires neither supervised training data nor additional training. We validate Cross-Refine across three NLP tasks using three state-of-the-art open-source LLMs through automatic and human evaluation. As the baseline, we select Self-Refine (Madaan et al., 2023), which refines explanations using only self-feedback. Our findings from automatic evaluation and a user study indicate that Cross-Refine outperforms Self-Refine. Moreover, Cross-Refine performs effectively with less powerful LLMs, whereas Self-Refine only yields strong results with ChatGPT. Additionally, we conduct an ablation study to assess the importance of feedback and suggestions; both play an important role in refining explanations. We further evaluate Cross-Refine on a bilingual dataset in English and German.