We introduce HypoExplore, an agentic framework that formulates neural architecture discovery for visual recognition as hypothesis-driven scientific inquiry. Given a human-specified, high-level research direction, HypoExplore ideates, implements, evaluates, and improves neural architectures through evolutionary branching. New hypotheses are generated by a large language model that builds on a selected parent hypothesis; parent selection follows a dual strategy that balances exploiting validated principles with resolving uncertain ones. The framework maintains a Trajectory Tree that records the lineage of all proposed architectures and a Hypothesis Memory Bank that tracks confidence scores derived from experimental evidence. After each experiment, multiple feedback agents analyze the results from different perspectives and consolidate their findings into hypothesis confidence updates. We evaluate HypoExplore on discovering lightweight vision architectures on CIFAR-10: the best architecture reaches 94.11% accuracy, evolved from a root-node baseline of 18.91%, and generalizes to CIFAR-100 and Tiny-ImageNet. We further demonstrate applicability to a specialized domain through independent architecture discovery runs on MedMNIST, which yield state-of-the-art performance. We show that hypothesis confidence scores grow increasingly predictive as evidence accumulates and that the learned principles transfer across independent evolutionary lineages, suggesting that HypoExplore not only discovers stronger architectures but can also help build a genuine understanding of the design space.
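To make the exploit-versus-resolve trade-off and the confidence updates concrete, the following is a minimal Python sketch, not the HypoExplore implementation. The abstract does not specify any scoring rule or update formula, so everything here is an assumption made for illustration: the `Hypothesis` fields, the linear `uncertainty` measure, the `explore_weight` mixing parameter, and the moving-average update with learning rate `lr`.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Hypothesis:
    """One entry in a toy Hypothesis Memory Bank (all fields are assumptions)."""
    text: str
    confidence: float = 0.5  # 0 = refuted, 1 = validated, 0.5 = unknown
    n_evidence: int = 0      # number of experiments that have tested it


def uncertainty(h: Hypothesis) -> float:
    """Assumed measure: 1.0 at confidence 0.5, falling linearly to 0.0 at 0 or 1."""
    return 1.0 - 2.0 * abs(h.confidence - 0.5)


def select_parent(bank, explore_weight=0.5):
    """Pick the hypothesis with the highest blended score.

    explore_weight = 0 exploits the most validated hypothesis;
    explore_weight = 1 targets the most uncertain one.
    """
    def score(h):
        return (1.0 - explore_weight) * h.confidence + explore_weight * uncertainty(h)
    return max(bank, key=score)


def update_confidence(h, agent_votes, lr=0.3):
    """Consolidate feedback-agent votes (each in [0, 1]) into one update.

    The votes are averaged, and the confidence moves a fraction `lr`
    of the way toward that consensus (an assumed update rule).
    """
    consensus = mean(agent_votes)
    h.confidence += lr * (consensus - h.confidence)
    h.n_evidence += 1
    return h.confidence


if __name__ == "__main__":
    # Hypothetical hypotheses, invented for this example.
    bank = [
        Hypothesis("depthwise convolutions help at small budgets", confidence=0.9),
        Hypothesis("early downsampling hurts accuracy", confidence=0.5),
        Hypothesis("wide shallow stems beat deep ones", confidence=0.1),
    ]
    parent = select_parent(bank, explore_weight=0.5)
    print("selected parent:", parent.text)
    update_confidence(parent, agent_votes=[1.0, 1.0, 0.0])
    print("updated confidence:", round(parent.confidence, 3))
```

In this sketch a single scalar controls the balance between the two strategies; a real system could equally alternate between them or sample probabilistically, and the abstract does not say which approach HypoExplore takes.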