Learning World Models With Hierarchical Temporal Abstractions: A Probabilistic Perspective

Machines that can replicate human intelligence with type 2 reasoning capabilities should be able to reason at multiple levels of spatio-temporal abstraction and scale using internal world models. Devising formalisms to develop such internal world models, which accurately reflect the causal hierarchies inherent in the dynamics of the real world, is a critical research challenge in artificial intelligence and machine learning. This thesis identifies several limitations of the prevalent use of state space models (SSMs) as internal world models and proposes two new probabilistic formalisms, namely Hidden-Parameter SSMs and Multi-Time Scale SSMs, to address these drawbacks. The structure of the graphical models in both formalisms facilitates scalable exact probabilistic inference using belief propagation, as well as end-to-end learning via backpropagation through time. This approach permits the development of scalable, adaptive hierarchical world models capable of representing nonstationary dynamics across multiple temporal abstractions and scales. Moreover, these probabilistic formalisms represent uncertainty in world states explicitly, which improves the system's capacity to emulate the stochastic nature of the real world and to quantify the confidence in its predictions. The thesis also discusses how these formalisms align with the related neuroscience literature on the Bayesian brain hypothesis and predictive processing. Our experiments on various real and simulated robots demonstrate that our formalisms can match, and in many cases exceed, the performance of contemporary transformer variants in making long-range future predictions. We conclude the thesis by reflecting on the limitations of our current models and suggesting directions for future research.

