Advancing scientific understanding through mechanistic modeling requires posing experimental questions that yield maximally informative data. To automate this pursuit within cognitive science, we introduce ATLAS (Active Theory Learning for Automated Science), an active learning framework for the data-driven discovery of interpretable behavioral models. ATLAS iterates between generating mechanistic hypotheses, instantiated as a diverse ensemble of sparse neural networks (Disentangled RNNs), and designing experiments that optimally distinguish between them. We test this approach on the problem of recovering reinforcement learning agents from their behavior in bandit tasks. ATLAS designs varied sequences of qualitatively novel experiments whose temporal structure is tailored to the characteristics of the underlying agent. The models trained on these experiments are evaluated against a comprehensive set of metrics for mechanistic modeling that capture behavioral, structural, and computational similarity. ATLAS achieves a 5-10x improvement in sample efficiency across all metrics compared to random experimentation, and its performance is further validated against expert-designed experiments derived from the literature. These in silico results showcase ATLAS's potential to accelerate human-interpretable insights in cognitive science and in other domains where scientific inquiry relies on discovering mechanistic models.
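To make the "design experiments that distinguish between hypotheses" step concrete, the following is a minimal sketch under stated assumptions, not the ATLAS implementation. Simple Q-learning agents with different learning rates stand in for the ensemble of Disentangled RNNs, a candidate experiment is a fixed reward schedule for a two-armed bandit, and disagreement is scored as the variance of predicted choice probabilities across the ensemble. The function names, the fixed probe choice sequence, and the variance criterion are all illustrative assumptions; the abstract does not specify the selection criterion.

```python
import numpy as np

def choice_probs(alpha, beta, rewards, probe_choices):
    """P(choose arm 1) on each trial for a softmax Q-learning agent.

    Stand-in for one ensemble member (assumption: the paper uses
    Disentangled RNNs, not hand-written Q-learners).
    """
    q = np.zeros(2)
    probs = np.empty(len(probe_choices))
    for t, c in enumerate(probe_choices):
        # Prediction is made before observing the outcome of trial t.
        probs[t] = 1.0 / (1.0 + np.exp(-beta * (q[1] - q[0])))
        # Delta-rule update of the chosen arm's value.
        q[c] += alpha * (rewards[t, c] - q[c])
    return probs

def disagreement(ensemble, rewards, probe_choices):
    """Mean over trials of the across-model variance in predictions."""
    preds = np.stack(
        [choice_probs(a, b, rewards, probe_choices) for a, b in ensemble]
    )
    return float(preds.var(axis=0).mean())

def select_experiment(ensemble, candidates, probe_choices):
    """Return the index of the candidate schedule the ensemble disagrees on most."""
    scores = [disagreement(ensemble, r, probe_choices) for r in candidates]
    return int(np.argmax(scores)), scores

# Hypothetical ensemble: (learning rate, inverse temperature) pairs.
ensemble = [(0.1, 3.0), (0.5, 3.0), (0.9, 3.0)]

T = 20
probe_choices = np.arange(T) % 2  # alternate arms 0, 1, 0, 1, ...

# Candidate 0: no arm is ever rewarded, so no model learns anything.
flat = np.zeros((T, 2))
# Candidate 1: arm 0 pays in the first half, arm 1 in the second half.
reversal = np.zeros((T, 2))
reversal[: T // 2, 0] = 1.0
reversal[T // 2 :, 1] = 1.0

best, scores = select_experiment(ensemble, [flat, reversal], probe_choices)
```

In this toy setting the unrewarded schedule leaves every model at chance, so the ensemble cannot be told apart, while the reversal schedule drives models with different learning rates to different predictions and is therefore selected. A full active learning loop would then collect behavior on the chosen experiment, retrain the ensemble, and repeat.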