Rooted in the explosion of deep learning over the past decade, this thesis spans from AlphaGo to ChatGPT to empirically examine the fundamental concepts needed to realize the vision of an artificial scientist: a machine with the capacity to autonomously generate original research and contribute to the expansion of human knowledge. The investigation begins with Olivaw, an AlphaGo Zero-like agent that discovers Othello knowledge from scratch but is unable to communicate it. This realization leads to the development of the Explanatory Learning (EL) framework, a formalization of the problem faced by a scientist when trying to explain a new phenomenon to their peers. The effective EL prescriptions allow us to crack Zendo, a popular board game simulating the scientific endeavor. This success comes with a fundamental insight: an artificial scientist must develop its own interpretation of the language used to explain its findings, rather than rely on a rigid existing interpreter. Questioning the very process of learning an interpreter, we turn our attention to the inner workings of modern multimodal models. This culminates in a simple idea for building CLIP-like models in which interpretation and perception are explicitly disentangled: a cost-effective approach that couples two unimodal models using little multimodal data and no further training. Finally, we discuss what ChatGPT and its siblings still lack to become artificial scientists, and introduce the Big-Bench Symbol Interpretation Task, a benchmark on interpreting Zendo-like explanations on which LLMs perform no better than random chance, while humans solve it in full.