Mind-map generation aims to convert a document into a hierarchical structure that shows its central idea and branches. This representation makes the logic and semantics of a document easier to understand than plain text. A recent state-of-the-art method encodes the sentences of a document sequentially and converts them into a relation graph via a sequence-to-graph transformation. Although this method generates mind-maps efficiently in parallel, its mechanism focuses on sequential features and hardly captures structural information. It also has difficulty modeling long-range semantic relations. In this work, we propose a coreference-guided mind-map generation network (CMGN) that incorporates external structure knowledge. Specifically, we construct a coreference graph based on coreference semantic relationships to introduce graph structure information. We then employ a coreference graph encoder to mine the potential governing relations between sentences. To exclude noise and make better use of the information in the coreference graph, we add a graph enhancement module trained in a contrastive learning manner. Experimental results demonstrate that our model outperforms all existing methods. A case study further shows that our model reveals the structure and semantics of a document more accurately and concisely. Code and data are available at https://github.com/Cyno2232/CMGN.
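To make the idea of a sentence-level coreference graph concrete, here is a minimal sketch. It is not the construction used in CMGN, which the abstract does not specify. It assumes that a coreference resolver has already produced clusters, each given as the indices of the sentences that mention the same entity. The function name `build_coreference_graph`, the input format, and the choice to add self-loops are all hypothetical.

```python
from itertools import combinations


def build_coreference_graph(num_sentences, clusters):
    """Build a symmetric sentence-level adjacency matrix.

    clusters: list of lists of sentence indices; each inner list holds the
    sentences that mention the same entity (assumed output of some
    coreference resolver, not shown here).
    """
    adjacency = [[0] * num_sentences for _ in range(num_sentences)]

    # Assumption: every sentence is connected to itself (self-loop).
    for i in range(num_sentences):
        adjacency[i][i] = 1

    # Connect every pair of distinct sentences that share a coreferent entity.
    for cluster in clusters:
        for a, b in combinations(sorted(set(cluster)), 2):
            adjacency[a][b] = 1
            adjacency[b][a] = 1

    return adjacency


# Toy document with 4 sentences and two entity clusters.
graph = build_coreference_graph(4, [[0, 1, 3], [1, 2]])
for row in graph:
    print(row)
```

In this toy example, sentences 0 and 3 are linked even though they are not adjacent in the text, which is the kind of long-range relation a purely sequential encoder tends to miss. A graph encoder could then take such an adjacency matrix together with sentence embeddings as input.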