This paper provides the first discourse parsing experiments with a large language model (LLM) finetuned on corpora annotated in the style of SDRT (Asher, 1993; Asher and Lascarides, 2003). The result is a discourse parser, LLaMIPa (LLaMA Incremental Parser), which is able to more fully exploit discourse context, leading to substantial performance gains over approaches that use encoder-only models to provide local, context-sensitive representations of discourse units. Furthermore, it is able to process discourse data incrementally, which is essential for the eventual use of discourse information in downstream tasks.