Despite remarkable progress in artificial intelligence, current object recognition models still fall short of emulating how the human brain processes visual information. Recent studies have highlighted the potential of using neural data to mimic brain processing; however, these often rely on invasive neural recordings from non-human subjects, leaving a critical gap in our understanding of human visual perception and in the development of more human brain-like vision models. To address this gap, we present, for the first time, "Re(presentational)Al(ignment)net" (ReAlnet), a vision model aligned with human brain activity based on non-invasive EEG recordings, which shows significantly higher similarity to human brain representations. Our image-to-brain multi-layer encoding alignment framework optimizes multiple layers of the model, marking a substantial advance in neural alignment, and enables the model to efficiently learn and mimic the human brain's visual representational patterns across object categories and across neural data modalities. Furthermore, we find that alignment with human brain representations improves the model's adversarial robustness. Our findings suggest that ReAlnet sets a new precedent in the field, bridging the gap between artificial and human vision and paving the way for more brain-like artificial intelligence systems.
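To make "similarity to human brain representations" concrete, the following is a minimal sketch of one common way to compare a model with neural data: representational similarity analysis, which correlates the pairwise dissimilarity structure of model activations with that of EEG patterns. It is an illustration under stated assumptions, not necessarily the metric or pipeline used for ReAlnet. All function names, array shapes, and the synthetic data are hypothetical.

```python
import numpy as np


def rdm(patterns):
    """Representational dissimilarity matrix: 1 - Pearson r between rows.

    `patterns` has shape (n_images, n_features); each row is the response
    to one image (model activations or EEG channel values).
    """
    return 1.0 - np.corrcoef(patterns)


def upper_triangle(matrix):
    """Return the off-diagonal upper-triangle entries as a flat vector."""
    rows, cols = np.triu_indices(matrix.shape[0], k=1)
    return matrix[rows, cols]


def spearman(x, y):
    """Spearman correlation as Pearson correlation of ranks.

    Assumes no tied values, which holds for the continuous synthetic data below.
    """
    rank_x = np.argsort(np.argsort(x))
    rank_y = np.argsort(np.argsort(y))
    return float(np.corrcoef(rank_x, rank_y)[0, 1])


def representational_similarity(model_features, neural_patterns):
    """Spearman correlation between the model RDM and the neural RDM."""
    return spearman(upper_triangle(rdm(model_features)),
                    upper_triangle(rdm(neural_patterns)))


# Synthetic stand-ins (assumptions, not real data): 20 images, 64 features.
rng = np.random.default_rng(0)
n_images, n_features = 20, 64
model_features = rng.standard_normal((n_images, n_features))

# "Aligned" neural patterns: a noisy copy of the model features.
eeg_aligned = model_features + 0.3 * rng.standard_normal((n_images, n_features))
# "Unrelated" neural patterns: independent noise.
eeg_unrelated = rng.standard_normal((n_images, n_features))

sim_aligned = representational_similarity(model_features, eeg_aligned)
sim_unrelated = representational_similarity(model_features, eeg_unrelated)
print(f"aligned: {sim_aligned:.3f}, unrelated: {sim_unrelated:.3f}")
```

In this toy setup the noisy copy should score much higher than the independent noise. With real recordings, the rows of `neural_patterns` would typically be EEG responses to the same images shown to the model, often computed per time point and per subject.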