Developments in the field of Artificial Intelligence (AI), and particularly large language models (LLMs), have created a 'perfect storm' for observing spurious 'sparks' of Artificial General Intelligence (AGI). Like simpler models, LLMs distill meaningful representations in their latent embeddings that have been shown to correlate with external variables. Nonetheless, such correlations have often been taken as evidence of human-like intelligence in the latter but not the former. We probe models of varying complexity, including random projections, matrix decompositions, deep autoencoders, and transformers: all of them successfully distill information that can be used to predict latent or external variables, yet none of them has previously been linked to AGI. We argue and empirically demonstrate that finding meaningful patterns in the latent spaces of models cannot be taken as evidence in favor of AGI. Additionally, we review literature from the social sciences showing that humans are prone to seek such patterns and to anthropomorphize. We conclude that both the methodological setup and the common public image of AI are ideal for the misinterpretation that correlations between model representations and some variables of interest are 'caused' by the model's understanding of underlying 'ground truth' relationships. We therefore call on the academic community to exercise extra caution, and to be keenly aware of principles of academic integrity, in interpreting and communicating about AI research outcomes.
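To make the probing claim concrete, the following is a minimal illustrative sketch (not the paper's actual experiment; the data, dimensions, and probe are hypothetical) showing that even an untrained random projection yields embeddings from which a linear probe can recover an external variable, which is why such correlations alone are weak evidence of understanding:

```python
# Minimal sketch: a fixed random projection plus a linear probe can
# "predict" an external variable, with no learning or understanding involved.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic "inputs" whose hidden structure drives an external variable (hypothetical data).
n, d_in, d_latent = 2000, 256, 8
latent = rng.normal(size=(n, d_latent))
inputs = latent @ rng.normal(size=(d_latent, d_in)) + 0.1 * rng.normal(size=(n, d_in))
external = latent[:, 0]  # the external variable of interest

# "Model": a fixed random projection -- no training whatsoever.
W = rng.normal(size=(d_in, 64)) / np.sqrt(d_in)
embeddings = np.tanh(inputs @ W)

# A linear probe on the random embeddings still predicts the external variable well.
X_tr, X_te, y_tr, y_te = train_test_split(embeddings, external, random_state=0)
probe = Ridge(alpha=1.0).fit(X_tr, y_tr)
print("probe R^2 on held-out data:", round(probe.score(X_te, y_te), 3))
```

The high held-out R^2 here reflects structure in the data and the probe, not any comprehension on the part of the "model", which is the interpretive trap the abstract warns against.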