Explaining the black-box predictions of NLP models naturally and accurately is an important open problem in natural language generation. Free-text explanations are expected to contain sufficient and carefully selected evidence to form supportive arguments for predictions. Owing to the superior generative capacity of large pretrained language models, recent work built on prompt engineering enables explanation generation without task-specific training. However, explanations generated through single-pass prompting often lack sufficiency and conciseness. To address this problem, we develop an information bottleneck method, EIB, that produces refined explanations that are both sufficient and concise. Our approach regenerates the free-text explanation by polishing the single-pass output of the pretrained language model while retaining the information that supports the content being explained. Experiments on two out-of-domain tasks verify the effectiveness of EIB through automatic evaluation and thoroughly conducted human evaluation.