Humans develop world models that capture the underlying generation process of data. Whether neural networks can learn similar world models remains an open problem. In this work, we provide the first theoretical results for this problem, showing that in a multi-task setting, models with a low-degree bias provably recover latent data-generating variables under mild assumptions -- even if proxy tasks involve complex, non-linear functions of the latents. However, such recovery is also sensitive to model architecture. Our analysis leverages Boolean models of task solutions via the Fourier-Walsh transform and introduces new techniques for analyzing invertible Boolean transforms, which may be of independent interest. We illustrate the algorithmic implications of our results and connect them to related research areas, including self-supervised learning, out-of-distribution generalization, and the linear representation hypothesis in large language models.
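The Fourier-Walsh transform mentioned above expands a Boolean function f : {-1,1}^n → R in the parity basis, f̂(S) = E_x[f(x) · ∏_{i∈S} x_i]; a "low-degree bias" then corresponds to most Fourier mass sitting on small sets S. The sketch below (illustrative only, not the paper's construction) computes these coefficients by brute-force enumeration for a toy degree-2 parity:

```python
from itertools import product, combinations

def fourier_coefficient(f, S, n):
    """Exactly compute f_hat(S) = E_x[f(x) * prod_{i in S} x_i]
    over the uniform distribution on {-1, 1}^n."""
    total = 0
    for x in product([-1, 1], repeat=n):
        chi = 1
        for i in S:  # character chi_S(x) = prod of coordinates in S
            chi *= x[i]
        total += f(x) * chi
    return total / 2 ** n

# Toy example: f(x) = x0 * x1 is a degree-2 parity, so its only
# nonzero Fourier-Walsh coefficient is on S = (0, 1).
n = 3
f = lambda x: x[0] * x[1]
coeffs = {S: fourier_coefficient(f, S, n)
          for d in range(n + 1)
          for S in combinations(range(n), d)}
```

Enumerating all 2^n inputs is only feasible for tiny n; it serves here to make the notion of Fourier degree concrete, not as an efficient algorithm.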