This work introduces approaches to assessing phrase breaks in ESL learners' speech using pre-trained language models (PLMs) and large language models (LLMs). We address two tasks: overall assessment of the phrase breaks in a speech clip, and fine-grained assessment of every possible phrase break position. To leverage NLP models, the speech input is first force-aligned with its text and then pre-processed into a token sequence that encodes both the words and the phrase break information. For PLMs, we propose a pre-training and fine-tuning pipeline over the processed tokens: pre-training uses a replaced break token detection module, and fine-tuning uses text classification and sequence labeling. For LLMs, we design prompts for ChatGPT. Experiments show that the PLMs greatly reduce the dependence on labeled training data while also improving performance. We further verify that ChatGPT, a well-known LLM, has potential for further advancement in this area.
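The pre-processing and pre-training steps described above can be illustrated with a minimal sketch. It assumes that forced alignment yields (word, start, end) triples, that the pause between adjacent words is bucketed into a small set of break tokens, and that replaced break token detection corrupts some break tokens and labels each position as original or replaced. The token names, thresholds, and function names below are hypothetical and are not taken from the paper.

```python
import random
from typing import List, Tuple

# Hypothetical break vocabulary and pause thresholds in seconds.
# These values are illustrative assumptions, not the paper's settings.
BREAK_TOKENS = ["<b0>", "<b1>", "<b2>"]
THRESHOLDS = [0.05, 0.30]  # pause < 0.05 -> <b0>, pause < 0.30 -> <b1>, else <b2>

Aligned = Tuple[str, float, float]  # (word, start_time, end_time)


def pause_to_break(pause: float) -> str:
    """Map a pause duration to one of the hypothetical break tokens."""
    for token, limit in zip(BREAK_TOKENS, THRESHOLDS):
        if pause < limit:
            return token
    return BREAK_TOKENS[-1]


def to_token_sequence(aligned: List[Aligned]) -> List[str]:
    """Interleave words with a break token for every between-word position."""
    tokens: List[str] = []
    for i, (word, _start, end) in enumerate(aligned):
        tokens.append(word)
        if i + 1 < len(aligned):
            next_start = aligned[i + 1][1]
            tokens.append(pause_to_break(max(0.0, next_start - end)))
    return tokens


def corrupt_breaks(
    tokens: List[str], replace_prob: float, rng: random.Random
) -> Tuple[List[str], List[int]]:
    """Replace break tokens at random with a different break token.

    Returns the corrupted sequence and per-token labels
    (1 = replaced, 0 = original), the targets a detector would learn.
    """
    corrupted: List[str] = []
    labels: List[int] = []
    for tok in tokens:
        if tok in BREAK_TOKENS and rng.random() < replace_prob:
            alternatives = [b for b in BREAK_TOKENS if b != tok]
            corrupted.append(rng.choice(alternatives))
            labels.append(1)
        else:
            corrupted.append(tok)
            labels.append(0)
    return corrupted, labels


if __name__ == "__main__":
    aligned = [("i", 0.0, 0.2), ("like", 0.22, 0.5), ("apples", 0.9, 1.4)]
    tokens = to_token_sequence(aligned)
    print(tokens)
    print(corrupt_breaks(tokens, replace_prob=0.5, rng=random.Random(0)))
```

Under these assumptions, the corrupted sequences and their labels would serve as pre-training data that needs no human annotation, while the uncorrupted token sequences would be the input for fine-tuning on the two assessment tasks.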