Despite significant progress in understanding and improving faithfulness in abstractive summarization, the question of how decoding strategies affect faithfulness remains understudied. We present a systematic study of the effect of generation techniques such as beam search and nucleus sampling on faithfulness in abstractive summarization. We find a consistent trend where beam search with large beam sizes produces the most faithful summaries while nucleus sampling generates the least faithful ones. We propose two faithfulness-aware generation methods to further improve faithfulness over current generation techniques: (1) ranking candidates generated by beam search using automatic faithfulness metrics and (2) incorporating lookahead heuristics that produce a faithfulness score for the future summary. We show that both generation methods significantly improve faithfulness across two datasets as evaluated by four automatic faithfulness metrics and human evaluation. To reduce computational cost, we demonstrate a simple distillation approach that allows the model to generate faithful summaries with just greedy decoding. Our code is publicly available at https://github.com/amazon-science/faithful-summarization-generation
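The first proposed method, ranking beam search candidates by an automatic faithfulness metric, can be illustrated with a minimal sketch. Everything named here is an assumption for illustration only: `toy_faithfulness` is a hypothetical token-overlap stand-in rather than one of the four metrics used in the paper, `rerank_candidates` is a hypothetical helper, and the candidate list is hard-coded where a real system would take the hypotheses returned by beam search.

```python
from typing import Callable, List, Sequence, Tuple


def toy_faithfulness(source: str, summary: str) -> float:
    """Hypothetical stand-in metric: the fraction of summary tokens
    that also appear in the source document. This is NOT one of the
    paper's metrics; it only makes the sketch runnable."""
    source_tokens = set(source.lower().split())
    summary_tokens = summary.lower().split()
    if not summary_tokens:
        return 0.0
    supported = sum(1 for tok in summary_tokens if tok in source_tokens)
    return supported / len(summary_tokens)


def rerank_candidates(
    source: str,
    candidates: Sequence[str],
    metric: Callable[[str, str], float] = toy_faithfulness,
) -> List[Tuple[float, str]]:
    """Score each candidate against the source and return
    (score, candidate) pairs sorted from most to least faithful.
    Python's sort is stable, so ties keep the original beam order."""
    scored = [(metric(source, cand), cand) for cand in candidates]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Assumed input: candidates as they might come out of beam search.
source = "the company reported a profit of 3 million dollars in 2020"
candidates = [
    "the company reported a loss in 2021",
    "the company reported a profit in 2020",
]

ranked = rerank_candidates(source, candidates)
best_score, best_summary = ranked[0]
print(best_summary)
```

In this sketch the second candidate is selected because every one of its tokens is supported by the source, whereas the first introduces two unsupported tokens. Swapping in a learned faithfulness metric only requires passing a different `metric` callable.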