Can general-purpose AI architectures go beyond prediction to discover the physical laws governing the universe? True intelligence relies on "world models" -- causal abstractions that allow an agent to not only predict future states but understand the underlying governing dynamics. While previous "AI Physicist" approaches have successfully recovered such laws, they typically rely on strong, domain-specific priors that effectively "bake in" the physics. Conversely, Vafa et al. recently showed that generic Transformers fail to acquire these world models, achieving high predictive accuracy without capturing the underlying physical laws. We bridge this gap by systematically introducing three minimal inductive biases. We show that ensuring spatial smoothness (by formulating prediction as continuous regression) and stability (by training with noisy contexts to mitigate error accumulation) enables generic Transformers to surpass prior failures and learn a coherent Keplerian world model, successfully fitting ellipses to planetary trajectories. However, true physical insight requires a third bias: temporal locality. By restricting the attention window to the immediate past -- imposing the simple assumption that future states depend only on the local state rather than a complex history -- we force the model to abandon curve-fitting and discover Newtonian force representations. Our results demonstrate that simple architectural choices determine whether an AI becomes a curve-fitter or a physicist, marking a critical step toward automated scientific discovery.
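The third bias, temporal locality, amounts to restricting each position's attention to a short window over the immediate past. The abstract does not specify an implementation, but a minimal sketch of such a local causal attention mask (plain NumPy, single head, with illustrative names like `window`) could look like this:

```python
import numpy as np

def local_causal_mask(seq_len: int, window: int) -> np.ndarray:
    """Boolean mask: position i may attend only to positions j
    with i - window < j <= i, i.e. the immediate past."""
    i = np.arange(seq_len)[:, None]
    j = np.arange(seq_len)[None, :]
    return (j <= i) & (j > i - window)

def local_attention(q, k, v, window: int) -> np.ndarray:
    """Single-head scaled dot-product attention restricted to a
    short local window, instead of the full causal history."""
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)
    mask = local_causal_mask(q.shape[0], window)
    scores = np.where(mask, scores, -np.inf)  # block distant past
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v
```

With `window=1` this reduces to a Markov assumption (the next state depends only on the current one), which is the regime in which the model is forced to represent instantaneous dynamics rather than fit whole trajectories.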