We introduce an extractive summarization system for meetings that leverages discourse structure to better identify salient information from complex multi-party discussions. Using discourse graphs to represent semantic relations between the contents of utterances in a meeting, we train a GNN-based node classification model to select the most important utterances, which are then combined to create an extractive summary. Experimental results on AMI and ICSI demonstrate that our approach surpasses existing text-based and graph-based extractive summarization systems, as measured by both classification and summarization metrics. Additionally, we conduct ablation studies on discourse structure and relation type to provide insights for future NLP applications leveraging discourse analysis theory.
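The pipeline described above (discourse graph, node classification, utterance selection) can be illustrated with a minimal sketch. This is not the authors' model: the relation names, the per-relation weights, the two-dimensional utterance features, and the linear scorer are all hypothetical stand-ins chosen for readability, and a single hand-written round of weighted mean aggregation takes the place of a trained GNN.

```python
import math
from dataclasses import dataclass


@dataclass
class Utterance:
    speaker: str
    text: str
    features: list  # hypothetical utterance embedding (toy 2-d vector here)


def propagate(features, edges, relation_weight):
    """One round of message passing over the discourse graph.

    Each node's new vector is the weighted mean of its own features and the
    features of nodes with an incoming discourse edge, where the weight
    depends on the relation type (assumed values, not learned).
    """
    dim = len(features[0])
    incoming = {i: [] for i in range(len(features))}
    for src, dst, relation in edges:
        incoming[dst].append((src, relation_weight.get(relation, 1.0)))

    updated = []
    for i, own in enumerate(features):
        total = list(own)   # self-loop with weight 1.0
        weight_sum = 1.0
        for src, w in incoming[i]:
            for d in range(dim):
                total[d] += w * features[src][d]
            weight_sum += w
        updated.append([v / weight_sum for v in total])
    return updated


def salience(vector, weights, bias):
    """Logistic score standing in for the node classifier's output."""
    z = sum(w * x for w, x in zip(weights, vector)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def extract_summary(utterances, edges, relation_weight, weights, bias, k):
    """Select the k highest-scoring utterances, kept in meeting order."""
    hidden = propagate([u.features for u in utterances], edges, relation_weight)
    scores = [salience(h, weights, bias) for h in hidden]
    top = sorted(range(len(utterances)), key=lambda i: scores[i], reverse=True)[:k]
    return [utterances[i].text for i in sorted(top)]


# Toy meeting fragment; features and relation labels are illustrative only.
utterances = [
    Utterance("A", "Should we use a rubber case?", [1.0, 0.0]),
    Utterance("B", "Yes, rubber feels better.", [0.8, 0.2]),
    Utterance("C", "Um, okay.", [0.0, 1.0]),
    Utterance("A", "Then let's go with rubber.", [0.9, 0.1]),
]
# Directed edges: (source index, target index, relation type).
edges = [
    (0, 1, "Question-answer_pair"),
    (1, 2, "Acknowledgement"),
    (1, 3, "Result"),
]
relation_weight = {"Question-answer_pair": 1.0, "Acknowledgement": 0.2, "Result": 1.0}

summary = extract_summary(
    utterances, edges, relation_weight, weights=[2.0, -2.0], bias=0.0, k=2
)
print(summary)
```

In a real system the aggregation weights and the scorer would be learned from utterance-level labels, and the graph would come from a discourse parser rather than being written by hand. The sketch only shows how relation-typed edges let an utterance's score depend on its discourse neighbours.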