Big Tech influence over AI research revisited: memetic analysis of attribution of ideas to affiliation

There is a growing discourse around Big Tech's domination of the artificial intelligence (AI) research landscape, yet our understanding of this phenomenon remains cursory. This paper aims to broaden and deepen our understanding of Big Tech's reach and power within AI research. It highlights dominance not merely in terms of sheer publication volume but in the propagation of new ideas, or \textit{memes}. Existing studies often reduce influence to the share of affiliations in academic papers, typically sourced from limited databases such as arXiv or specific academic conferences. The main goal of this paper is to unravel the nuances of such influence by determining which AI ideas are predominantly driven by Big Tech entities. By applying network and memetic analysis to AI-oriented paper abstracts and their citation network, we gain deeper insight into this phenomenon. By using two databases, OpenAlex and S2ORC, we are able to perform this analysis on a much larger scale than previous attempts. Our findings suggest that while Big Tech-affiliated papers are disproportionately more cited in some areas, the most cited papers are those affiliated with both Big Tech and Academia. For the most contagious memes, attribution to specific affiliation groups (Big Tech, Academia, mixed affiliation) appears to be evenly distributed across the three groups. This suggests that the notion of Big Tech domination over AI research is oversimplified in the discourse. Ultimately, this more nuanced understanding of Big Tech's and Academia's influence could inform a more symbiotic alliance between these stakeholders, one that would better serve the dual goals of societal welfare and the scientific integrity of AI research.
