Recent studies have provided empirical evidence of the wide-ranging potential of the Generative Pre-trained Transformer (GPT), a pretrained language model, in natural language processing. GPT has been employed effectively as a decoder within state-of-the-art (SOTA) question answering systems, yielding exceptional performance across various tasks. However, research on applying GPT to Vietnamese remains limited. This paper addresses this gap by presenting an implementation of GPT-2 for community-based question answering focused on COVID-19-related queries in Vietnamese. We introduce a novel approach by conducting a comparative analysis of different Transformer models against SOTA models on a community-based COVID-19 question answering dataset. The experimental results show that the GPT-2 models achieve highly promising results, outperforming other SOTA models as well as previous community-based COVID-19 question answering models developed for Vietnamese.