We investigate the effectiveness of GPT-3.5 and GPT-4, two large language models, as Grammatical Error Correction (GEC) tools for Brazilian Portuguese and compare their performance against Microsoft Word and Google Docs. We introduce a GEC dataset for Brazilian Portuguese with four categories: Grammar, Spelling, Internet, and Fast typing. Our results show that while GPT-4 has higher recall than other methods, LLMs tend to have lower precision, leading to overcorrection. This study demonstrates the potential of LLMs as practical GEC tools for Brazilian Portuguese and encourages further exploration of LLMs for non-English languages and other educational settings.
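To make the precision and recall trade-off concrete, the following minimal sketch computes both metrics from counts of correct edits, spurious edits, and missed errors. The function name and all counts are hypothetical and chosen only for illustration; they are not results from this study, and the edit-level counting scheme is an assumption rather than the evaluation protocol used here.

```python
def precision_recall(true_pos: int, false_pos: int, false_neg: int) -> tuple[float, float]:
    """Return (precision, recall) from edit counts.

    true_pos:  errors the tool corrected properly
    false_pos: edits the tool made that were not needed (overcorrection)
    false_neg: errors the tool left uncorrected
    """
    proposed = true_pos + false_pos
    actual = true_pos + false_neg
    precision = true_pos / proposed if proposed else 0.0
    recall = true_pos / actual if actual else 0.0
    return precision, recall


# Hypothetical counts, for illustration only (not taken from the paper).
# A conservative tool makes few edits: few spurious changes, many missed errors.
conservative = precision_recall(true_pos=60, false_pos=5, false_neg=40)
# An aggressive tool makes many edits: it catches more errors but also
# changes text that was already correct.
aggressive = precision_recall(true_pos=90, false_pos=45, false_neg=10)

print(f"conservative: precision={conservative[0]:.2f}, recall={conservative[1]:.2f}")
print(f"aggressive:   precision={aggressive[0]:.2f}, recall={aggressive[1]:.2f}")
```

In this toy setting the aggressive tool has the higher recall and the lower precision, which is the pattern described above as overcorrection.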