Natural Language Processing (NLP) is poised to substantially influence the world. However, significant progress comes hand-in-hand with substantial risks, and addressing them requires broad engagement with various fields of study. Yet little empirical work examines the state of such engagement, past or present. In this paper, we quantify the degree of mutual influence between NLP and 23 fields of study. We analyze ~77k NLP papers, ~3.1m citations from NLP papers to other papers, and ~1.8m citations from other papers to NLP papers. We show that, unlike most fields, the cross-field engagement of NLP, measured by our proposed Citation Field Diversity Index (CFDI), has declined from 0.58 in 1980 to 0.31 in 2022 (an all-time low). In addition, we find that NLP has grown more insular: it cites increasingly more NLP papers and has fewer papers that act as bridges between fields. NLP citations are dominated by computer science; less than 8% of NLP citations are to linguistics, and less than 3% are to math and psychology. These findings underscore NLP's urgent need to reflect on its engagement with various fields.
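The abstract names the Citation Field Diversity Index but does not define it. As a rough illustration only, the sketch below assumes a Gini-Simpson style index: one minus the sum of squared shares of citations going to each field. Under this assumption a value of 0 means that every citation goes to a single field, and the value rises as citations spread more evenly across fields. The function name `citation_field_diversity` and the sample counts are hypothetical and are not taken from the paper's data.

```python
from collections import Counter


def citation_field_diversity(cited_fields):
    """Return 1 - sum(p_f ** 2), where p_f is the share of citations to field f.

    Assumption: this Gini-Simpson form stands in for CFDI; the abstract
    itself does not give the formula.
    """
    counts = Counter(cited_fields)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("at least one citation is required")
    return 1.0 - sum((n / total) ** 2 for n in counts.values())


# Hypothetical reference list: 8 citations to computer science,
# 1 to linguistics, 1 to mathematics (made-up numbers for illustration).
sample = ["Computer Science"] * 8 + ["Linguistics"] + ["Mathematics"]
print(round(citation_field_diversity(sample), 2))  # 0.34
```

With these made-up counts the shares are 0.8, 0.1, and 0.1, so the index is 1 - (0.64 + 0.01 + 0.01) = 0.34. A reference list drawn from one field scores 0, and an even split across two fields scores 0.5.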