Natural Language Processing (NLP) is poised to substantially influence the world. However, significant progress comes hand-in-hand with substantial risks, and addressing them requires broad engagement with various fields of study. Yet little empirical work examines the state of such engagement, past or current. In this paper, we quantify the degree of influence between 23 fields of study and NLP, in both directions. We analyze ~77k NLP papers, ~3.1m citations from NLP papers to other papers, and ~1.8m citations from other papers to NLP papers. We show that, unlike most fields, the cross-field engagement of NLP, measured by our proposed Citation Field Diversity Index (CFDI), has declined from 0.58 in 1980 to 0.31 in 2022 (an all-time low). In addition, we find that NLP has grown more insular: it cites increasingly more NLP papers and has fewer papers that act as bridges between fields. NLP citations are dominated by computer science; less than 8% of NLP citations are to linguistics, and less than 3% are to math and psychology. These findings underscore NLP's urgent need to reflect on its engagement with various fields.
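The abstract does not state how CFDI is computed. As a purely illustrative sketch, one common way to score how evenly citations spread across fields is a Gini-Simpson style index, 1 minus the sum of squared field shares. The choice of that formula is an assumption made here, not a definition taken from the text, and the function name `field_diversity` and the toy field labels are hypothetical.

```python
from collections import Counter


def field_diversity(cited_fields):
    """Gini-Simpson style diversity over the fields of cited papers.

    ASSUMPTION: this formula (1 - sum of squared shares) is used only
    for illustration; the abstract does not define CFDI.
    """
    counts = Counter(cited_fields)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("need at least one citation")
    # Share of citations going to each field, squared and summed.
    concentration = sum((n / total) ** 2 for n in counts.values())
    return 1.0 - concentration


# Hypothetical toy inputs: one label per outgoing citation.
insular = ["Computer Science"] * 4
mixed = ["Computer Science", "Linguistics", "Mathematics", "Psychology"]

insular_score = field_diversity(insular)
mixed_score = field_diversity(mixed)
```

Under this assumed index, a paper that cites only one field scores 0, and the score rises as citations are spread more evenly over more fields, which matches the direction of the trend the abstract reports (a lower value means less cross-field engagement).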