Machines that can replicate human intelligence with Type 2 reasoning capabilities should be able to reason at multiple levels of spatio-temporal abstraction and scale using internal world models. Devising formalisms for such internal world models, which accurately reflect the causal hierarchies inherent in the dynamics of the real world, is a critical research challenge in artificial intelligence and machine learning. This thesis identifies several limitations in the prevalent use of state space models (SSMs) as internal world models and proposes two new probabilistic formalisms, Hidden-Parameter SSMs and Multi-Time Scale SSMs, to address these drawbacks. The structure of the graphical models in both formalisms facilitates scalable exact probabilistic inference using belief propagation, as well as end-to-end learning via backpropagation through time. This approach permits the development of scalable, adaptive hierarchical world models capable of representing nonstationary dynamics across multiple temporal abstractions and scales. Moreover, these probabilistic formalisms represent uncertainty in world states explicitly, improving the system's capacity to emulate the stochastic nature of the real world and to quantify the confidence in its predictions. The thesis also discusses how these formalisms align with related neuroscience literature on the Bayesian brain hypothesis and predictive processing. Our experiments on various real and simulated robots demonstrate that our formalisms can match, and in many cases exceed, the performance of contemporary transformer variants in making long-range future predictions. We conclude the thesis by reflecting on the limitations of our current models and suggesting directions for future research.
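To make the idea of exact inference by message passing in an SSM concrete, the following is a minimal sketch for the simplest possible case: a scalar linear-Gaussian SSM, where forward belief propagation on the chain reduces to the classical Kalman filter. It is not the thesis's Hidden-Parameter or Multi-Time Scale model. The function name `kalman_filter_1d` and the parameters `a`, `q`, `h`, `r`, `mean0`, and `var0` are assumptions chosen for illustration only.

```python
def kalman_filter_1d(observations, a, q, h, r, mean0, var0):
    """Exact forward (filtering) pass for a scalar linear-Gaussian SSM.

    Assumed illustrative model (not taken from the thesis):
        x_t = a * x_{t-1} + process noise,   noise ~ N(0, q)
        y_t = h * x_t     + observation noise, noise ~ N(0, r)

    Returns a list of (posterior_mean, posterior_variance) per time step.
    An observation of None is treated as missing.
    """
    mean, var = mean0, var0
    posteriors = []
    for y in observations:
        # Predict: push the previous belief through the linear dynamics.
        prior_mean = a * mean
        prior_var = a * a * var + q

        if y is None:
            # No evidence at this step: the posterior equals the prior.
            mean, var = prior_mean, prior_var
        else:
            # Update: condition the Gaussian prior on the observation.
            gain = prior_var * h / (h * h * prior_var + r)
            mean = prior_mean + gain * (y - h * prior_mean)
            var = (1.0 - gain * h) * prior_var

        posteriors.append((mean, var))
    return posteriors


if __name__ == "__main__":
    beliefs = kalman_filter_1d(
        [2.0, 2.0, None, 2.0], a=1.0, q=0.1, h=1.0, r=1.0, mean0=0.0, var0=1.0
    )
    for t, (m, v) in enumerate(beliefs):
        print(f"t={t}: mean={m:.3f}, var={v:.3f}")
```

Each belief is a Gaussian summarized by a mean and a variance, so the model's uncertainty about the state is available at every step: the variance shrinks when an observation arrives and grows again when one is missing.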