Linguistic ambiguity has always been one of the main challenges for Natural Language Processing (NLP) systems. Modern Transformer architectures such as BERT, T5 and, more recently, InstructGPT have achieved impressive improvements in many NLP fields, but plenty of work remains. Motivated by the uproar caused by ChatGPT, in this paper we provide an introduction to linguistic ambiguity, its varieties and their relevance in modern NLP, and perform an extensive empirical analysis. ChatGPT's strengths and weaknesses are revealed, as well as strategies to get the most out of this model.