Despite substantial progress in abstractive text summarization toward generating fluent and informative text, factual inconsistency in generated summaries remains an important and challenging open problem. In this paper, we construct causal graphs for abstractive text summarization and identify the intrinsic causes of factual inconsistency, namely language bias and irrelevancy bias. We then propose a debiasing framework, CoFactSum, that alleviates the causal effects of these biases through counterfactual estimation. Specifically, CoFactSum provides two counterfactual estimation strategies: Explicit Counterfactual Masking, which uses an explicit dynamic masking strategy, and Implicit Counterfactual Training, which uses an implicit discriminative cross-attention mechanism. We also design a Debiasing Degree Adjustment mechanism that dynamically adapts the debiasing degree at each decoding step. Extensive experiments on two widely used summarization datasets demonstrate that CoFactSum improves the factual consistency of generated summaries compared with several baselines.
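To make the idea of counterfactual debiasing at decoding time concrete, the following is a minimal sketch and not the CoFactSum implementation. It assumes that, at each decoding step, one set of token scores comes from the full input (the factual branch) and another from a bias-only counterfactual input, and that the debiased distribution is obtained by subtracting the scaled counterfactual scores. The function names `debias_step` and `adaptive_degree`, and the entropy-based rule for choosing the debiasing degree, are hypothetical illustrations introduced here; the abstract does not specify how the degree is computed.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def adaptive_degree(counterfactual_logits, max_degree=1.0):
    """Hypothetical heuristic (an assumption, not from the paper):
    debias more strongly when the bias-only distribution is peaked,
    i.e. when its entropy is low relative to the uniform distribution."""
    probs = softmax(counterfactual_logits)
    max_entropy = math.log(len(probs))
    if max_entropy == 0.0:
        return 0.0  # single-token vocabulary: nothing to debias
    entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
    # Clamp at zero to absorb floating-point rounding.
    return max(0.0, max_degree * (1.0 - entropy / max_entropy))


def debias_step(factual_logits, counterfactual_logits, degree):
    """Subtract the scaled counterfactual scores from the factual scores
    and renormalize into a probability distribution."""
    if len(factual_logits) != len(counterfactual_logits):
        raise ValueError("both branches must score the same vocabulary")
    adjusted = [
        f - degree * c
        for f, c in zip(factual_logits, counterfactual_logits)
    ]
    return softmax(adjusted)


# Toy three-token vocabulary: the bias-only branch strongly favors token 0.
factual = [2.0, 1.0, 0.0]
counterfactual = [2.0, 0.0, 0.0]
degree = adaptive_degree(counterfactual)
debiased = debias_step(factual, counterfactual, degree)
```

In this toy setting, a degree of zero leaves the factual distribution unchanged, while a larger degree shifts probability away from tokens that the bias-only branch already favors.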