Humans excel at discovering regular structures from limited samples and applying inferred rules to novel settings. We investigate whether modern generative models can similarly learn underlying rules from finite samples and perform reasoning through conditional sampling. Inspired by Raven's Progressive Matrices task, we designed the GenRAVEN dataset, in which each sample consists of three rows and one of 40 relational rules governing object position, number, or attributes applies to all rows. We trained generative models to learn the data distribution, encoding samples as integer arrays to focus on rule learning. We compared two generative model families: diffusion models (EDM, DiT, SiT) and autoregressive models (GPT2, Mamba). We evaluated their ability to generate structurally consistent samples and to perform panel completion via unconditional and conditional sampling. We found that diffusion models excel at unconditional generation, producing more novel and consistent samples from scratch while memorizing less, but they perform worse at panel completion, even with advanced conditional sampling methods. Conversely, autoregressive models excel at completing missing panels in a rule-consistent manner but generate less consistent samples unconditionally. We observe diverse data scaling behaviors: for both model families, rule learning emerges only beyond a certain dataset size, on the order of thousands of examples per rule. With more training data, diffusion models improve both their unconditional and conditional generation capabilities. For autoregressive models, however, panel completion improves with more training data while unconditional generation consistency declines. Our findings highlight complementary capabilities and limitations of diffusion and autoregressive models in rule learning and reasoning tasks, suggesting avenues for further research into their mechanisms and potential for human-like reasoning.
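To make the setup concrete, the sketch below illustrates one way a GenRAVEN-style sample could be represented as an integer array and how a panel-completion query might be posed to a conditional sampler. The array shape, attribute slots, and mask value are illustrative assumptions, not the dataset's actual encoding.

```python
import numpy as np

# Hypothetical GenRAVEN-style encoding (shapes and attribute slots are
# illustrative assumptions, not the paper's exact format): each sample has
# 3 rows x 3 panels; each panel holds up to 9 object slots, and each slot
# stores integer attributes (e.g. presence, shape, size, color).
NUM_ROWS, NUM_PANELS, NUM_SLOTS, NUM_ATTRS = 3, 3, 9, 4

def random_sample(rng):
    """Draw a toy integer-array sample (no relational rule is enforced here)."""
    return rng.integers(0, 8, size=(NUM_ROWS, NUM_PANELS, NUM_SLOTS, NUM_ATTRS))

def panel_completion_query(sample):
    """Mask the final panel of the last row, as in RPM-style panel completion.

    A conditional sampler (diffusion inpainting or autoregressive decoding)
    would be asked to fill in the masked entries given the visible context.
    """
    context = sample.copy()
    mask = np.zeros_like(sample, dtype=bool)
    mask[-1, -1] = True      # the missing bottom-right panel
    context[mask] = -1       # -1 marks positions to be generated
    return context, mask

rng = np.random.default_rng(0)
sample = random_sample(rng)
context, mask = panel_completion_query(sample)
print(context.shape, int(mask.sum()), "entries to complete")
```

Under this toy encoding, unconditional generation corresponds to sampling the full array from scratch, whereas panel completion conditions on every entry except the masked panel.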