Abstract Meaning Representation (AMR) parsing aims to extract an abstract semantic graph from a given sentence. Sequence-to-sequence approaches, which linearize the semantic graph into a sequence of nodes and edges and generate that sequence directly, have achieved good performance. However, we observe that these approaches suffer from structure loss accumulation during decoding: nodes and edges decoded later have a much lower F1 score than those decoded earlier. To address this issue, we propose a novel Reverse Graph Linearization (RGL) enhanced framework. RGL defines both a default and a reverse linearization order for an AMR graph, such that most structures at the back of the default order appear at the front of the reverse order, and vice versa. RGL incorporates the reverse linearization into the original AMR parser through a two-pass self-distillation mechanism, which guides the model when it generates the default linearization. Our analysis shows that the proposed method significantly mitigates structure loss accumulation, outperforming the previous best AMR parsing model by 0.8 and 0.5 Smatch points on the AMR 2.0 and AMR 3.0 datasets, respectively. The code is available at https://github.com/pkunlp-icler/AMR_reverse_graph_linearization.
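To make the idea of default and reverse linearization orders concrete, the following is a minimal Python sketch on a toy graph. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say how the reverse order is constructed, so the sketch assumes it can be approximated by visiting sibling edges right-to-left during a depth-first traversal. The `linearize` function, the `Graph` type, and the simplified token format (no variable names such as `w / want-01`) are all hypothetical.

```python
from typing import Dict, List, Tuple

# Toy AMR-like graph: node -> ordered list of (edge label, child node).
Graph = Dict[str, List[Tuple[str, str]]]


def linearize(graph: Graph, root: str, reverse: bool = False) -> List[str]:
    """Depth-first linearization of a graph into a flat token list.

    With reverse=True, sibling edges are visited right-to-left. This is an
    assumed stand-in for the reverse order, which the abstract does not define.
    """
    tokens: List[str] = []
    visited = set()

    def visit(node: str) -> None:
        if node in visited:
            # Re-entrant node: emit only a reference to it.
            tokens.append(node)
            return
        visited.add(node)
        tokens.append("(")
        tokens.append(node)
        edges = graph.get(node, [])
        if reverse:
            edges = list(reversed(edges))
        for label, child in edges:
            tokens.append(label)
            visit(child)
        tokens.append(")")

    visit(root)
    return tokens


# "The boy wants to go": want-01 :ARG0 boy :ARG1 (go-02 :ARG0 boy)
toy: Graph = {
    "want-01": [(":ARG0", "boy"), (":ARG1", "go-02")],
    "go-02": [(":ARG0", "boy")],
}

default_order = linearize(toy, "want-01")
reverse_order = linearize(toy, "want-01", reverse=True)

print(" ".join(default_order))
print(" ".join(reverse_order))
```

In this toy case the `go-02` subgraph sits near the end of the default sequence and near the start of the reverse one, which is the property the abstract attributes to the two orders. The two-pass self-distillation step is not sketched here, since the abstract gives no detail on its training objective.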