Although discourse parsing can benefit many NLP fields, no broad search over language models has been conducted for implicit discourse relation classification. This gap keeps researchers from making full use of publicly available models in discourse analysis. This work presents a straightforward comparison of seven pre-trained language models (PLMs), each fine-tuned for discourse relation classification. We use PDTB-3, a popular dataset annotated with discourse relations. Through our model search, we raise the state of the art (SOTA) to an accuracy of 0.671 and obtain novel observations. Some contradict what has been reported before (Shi and Demberg, 2019b): we find that sentence-level pre-training objectives (NSP, SBO, SOP) generally fail to produce the best-performing model for implicit discourse relation classification. Counterintuitively, similar-sized PLMs trained with masked language modeling (MLM) and full attention led to better performance.