Dialogue discourse parsing aims to uncover the internal structure of a multi-participant conversation by finding all the discourse~\emph{links} and corresponding~\emph{relations}. Previous work either treats this task as a series of independent multiple-choice problems, in which link existence and relations are decoded separately, or restricts the encoding to local interactions, ignoring holistic structural information. In contrast, we propose a principled method that improves upon previous work from two perspectives: encoding and decoding. On the encoding side, we perform structured encoding on the adjacency matrix followed by the matrix-tree learning algorithm, where all discourse links and relations in the dialogue are jointly optimized based on a latent tree-level distribution. On the decoding side, we perform structured inference using the modified Chu-Liu-Edmonds algorithm, which explicitly generates the labeled multi-root non-projective spanning tree that best captures the discourse structure. In addition, unlike previous work, we do not rely on hand-crafted features; this improves the model's robustness. Experiments show that our method achieves a new state of the art, surpassing the previous model by 2.3 F1 on STAC and 1.5 F1 on Molweni.\footnote{Code released at~\url{https://github.com/chijames/structured_dialogue_discourse_parsing}.}
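
As a rough illustration of what a latent tree-level distribution can look like, one standard multi-root form of the matrix-tree theorem is sketched here. The notation is assumed for illustration and does not come from the text above: $A_{h,m} \ge 0$ is the exponentiated score of a link from utterance $h$ to utterance $m$ (with $A_{m,m} = 0$), $r_m \ge 0$ is the exponentiated score of utterance $m$ being a root, and $n$ is the number of utterances.
\[
L_{h,m} =
\left\{
\begin{array}{ll}
r_m + \sum_{h'=1}^{n} A_{h',m} & \mbox{if } h = m, \\
-A_{h,m} & \mbox{otherwise,}
\end{array}
\right.
\qquad
Z = \det(L),
\qquad
p(y) = \frac{1}{Z} \prod_{m \in \mathrm{roots}(y)} r_m \prod_{(h,m) \in y} A_{h,m}.
\]
Under these assumptions, $Z$ sums the weights of all multi-root non-projective spanning trees over the $n$ utterances, so minimizing $-\log p(y)$ for a gold structure $y$ scores every link against all competing trees at once rather than one link at a time. The exact parameterization used by the method, including how relation labels enter the scores, may differ from this sketch.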