Direct dependency parsing of the speech signal, as opposed to parsing speech transcriptions, has recently been proposed as a task (Pupier et al. 2022). It offers a way to incorporate prosodic information into the parsing system and to bypass the limitations of a pipeline approach, which first runs an Automatic Speech Recognition (ASR) system and then a syntactic parser. In this article, we report on a set of experiments assessing the performance of two parsing paradigms (graph-based parsing and sequence-labeling-based parsing) on speech parsing. We perform this evaluation on a large treebank of spoken French featuring realistic spontaneous conversations. Our findings show that (i) the graph-based approach obtains better results across the board, and (ii) parsing directly from speech outperforms a pipeline approach, despite using 30% fewer parameters.