We investigate the effectiveness of GPT-3.5 and GPT-4, two large language models, as Grammatical Error Correction (GEC) tools for Brazilian Portuguese and compare their performance against Microsoft Word and Google Docs. We introduce a GEC dataset for Brazilian Portuguese with four categories: Grammar, Spelling, Internet, and Fast typing. Our results show that while GPT-4 has higher recall than other methods, LLMs tend to have lower precision, leading to overcorrection. This study demonstrates the potential of LLMs as practical GEC tools for Brazilian Portuguese and encourages further exploration of LLMs for non-English languages and other educational settings.
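The trade-off between recall and precision described above can be illustrated with a minimal sketch. The counting scheme is an assumption made here for illustration, not the paper's stated evaluation protocol: a true positive is an error that was properly corrected, a false positive is an unnecessary edit to text that was already correct, and a false negative is an error left uncorrected. The function name and all counts are hypothetical and are not results from the study.

```python
def precision_recall(tp: int, fp: int, fn: int) -> tuple[float, float]:
    """Return (precision, recall) from correction counts.

    Assumed definitions (illustrative only):
      tp: errors that were properly corrected
      fp: unnecessary edits made to already-correct text
      fn: errors left uncorrected
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical counts for two tools run on the same 100 errors.
# A conservative tool makes few edits; an aggressive tool edits freely.
conservative = precision_recall(tp=60, fp=5, fn=40)
aggressive = precision_recall(tp=90, fp=45, fn=10)

print(f"conservative: precision={conservative[0]:.2f}, recall={conservative[1]:.2f}")
print(f"aggressive:   precision={aggressive[0]:.2f}, recall={aggressive[1]:.2f}")
```

In this made-up scenario the aggressive tool fixes more of the errors (higher recall) but also changes text that needed no correction, which lowers its precision. That is the pattern the abstract refers to as overcorrection.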