The dominant artificial intelligence paradigm trains neural architectures by gradient descent on proxy objectives and by reinforcement learning from human feedback. Although remarkably capable, this top-down optimization inherently produces structural failure modes, including hallucination, sycophancy, reward hacking, and alignment fragility, which are limitations of the paradigm rather than mere engineering defects. In response, we introduce RECLAIM (Recursive, Ecological, Cognitive, Lifelike, Adaptive, Intelligent Machine), a theoretical framework for cultivating intelligence through computational ecology rather than engineering it through strict optimization. The framework rests on four interlocking theoretical pillars. General Darwinism replaces gradients with blind variation and selective retention. Non-agentic emergence replaces evaluative rewards with environmental physics, so that there is no reward specification to game against human intent. The Polya-Hebbian bridge applies Polya urn dynamics to Hebbian reinforcement to produce path-dependent specialization. Finally, the free energy principle is integrated as environmental thermodynamics rather than as an agent objective. The architecture situates autopoietic units, bounded by Markov blankets and competing for finite computational energy, within a data ecology shaped by cognitive food chains and Red Queen arms races. The framework suggests that dual-process cognition, sensory specialization, analogical reasoning, and intrinsic motivation emerge spontaneously as natural consequences of evolution under resource constraints. We conceptualize this paradigm transition as the OMEGA shift: a move from optimization and maximization to emergence through generative autopoiesis.
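As an illustration of the path dependence invoked by the Polya-Hebbian bridge, the following minimal sketch simulates a classical Polya urn in Python. It is not the RECLAIM update rule, which the abstract does not specify. The function names `polya_urn` and `final_shares`, the starting composition of one ball per colour, and the reading of each colour as a competing pathway whose use strengthens it are all assumptions made for illustration.

```python
import random


def polya_urn(steps, n_colors=2, reinforcement=1, seed=None):
    """Simulate a Polya urn (illustrative sketch, not the RECLAIM rule).

    Each step draws a colour with probability proportional to its current
    count, then adds `reinforcement` extra balls of that colour.
    """
    rng = random.Random(seed)
    counts = [1] * n_colors  # assumption: start with one ball per colour
    for _ in range(steps):
        # Draw in proportion to current counts (rich-get-richer dynamics).
        drawn = rng.choices(range(n_colors), weights=counts, k=1)[0]
        # Hebbian-style reading (assumed): the pathway that fired is strengthened.
        counts[drawn] += reinforcement
    return counts


def final_shares(n_runs=5, steps=2000):
    """Return the final share of colour 0 for several independent runs."""
    shares = []
    for seed in range(n_runs):
        counts = polya_urn(steps, seed=seed)
        shares.append(counts[0] / sum(counts))
    return shares


if __name__ == "__main__":
    for seed, share in enumerate(final_shares()):
        print(f"run {seed}: share of colour 0 = {share:.3f}")
```

Each run applies the same rule from the same starting urn, yet the share of a given colour settles at a different value from run to run, because early draws are amplified by later reinforcement. This is the sense in which urn-style reinforcement yields path-dependent specialization without any external objective.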