Natural Language Processing (NLP) is poised to substantially influence the world. However, significant progress comes hand-in-hand with substantial risks, and addressing them requires broad engagement with various fields of study. Yet little empirical work examines the state of such engagement, past or current. In this paper, we quantify the degree of influence between NLP and 23 fields of study, in both directions. We analyzed ~77k NLP papers, ~3.1M citations from NLP papers to other papers, and ~1.8M citations from other papers to NLP papers. We show that, unlike most fields, the cross-field engagement of NLP, measured by our proposed Citation Field Diversity Index (CFDI), has declined from 0.58 in 1980 to 0.31 in 2022 (an all-time low). In addition, we find that NLP has grown more insular: it cites increasingly more NLP papers and has fewer papers that act as bridges between fields. NLP citations are dominated by computer science; less than 8% of NLP citations are to linguistics, and less than 3% are to math and psychology. These findings underscore NLP's urgent need to reflect on its engagement with various fields.
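The abstract names the Citation Field Diversity Index but does not define it. As a rough illustration only, the sketch below assumes a Gini-Simpson-style diversity score over the fields of the papers cited; this form, the function name `field_diversity`, and the example field labels are assumptions for illustration, not the definition given in the paper.

```python
from collections import Counter


def field_diversity(cited_fields):
    """Gini-Simpson-style diversity over cited fields (assumed form).

    cited_fields: one field label per outgoing citation.
    Returns 1 - sum(p_f ** 2), where p_f is the share of citations to field f.
    """
    counts = Counter(cited_fields)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("need at least one citation")
    # Sum of squared shares is the chance two random citations hit the same field.
    return 1.0 - sum((n / total) ** 2 for n in counts.values())


# Hypothetical citation lists, labeled by field of the cited paper.
insular = ["Computer Science"] * 4
mixed = ["Computer Science", "Linguistics"]
```

Under this assumed form, a set of citations that all go to one field scores 0, and the score rises as citations spread more evenly across more fields.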