Despite advances in artificial intelligence, object recognition models still lag behind the human brain in emulating visual information processing. Recent studies have highlighted the potential of using neural data to mimic brain processing; however, these often rely on invasive neural recordings from non-human subjects, leaving a critical gap in understanding human visual perception. To address this gap, we present ReAlnet ('Re(presentational)Al(ignment)net'), a vision model aligned with human brain activity via non-invasive EEG, which demonstrates significantly higher similarity to human brain representations. Our innovative image-to-brain multi-layer encoding framework advances human neural alignment by optimizing multiple model layers, enabling the model to efficiently learn and mimic the human brain's visual representational patterns across object categories and modalities. Our findings suggest that ReAlnet aligns artificial neural networks more closely with human brain representations than traditional computer vision models, taking an important step toward bridging the gap between artificial and human vision and achieving more brain-like artificial intelligence systems.
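The abstract does not specify implementation details, but the multi-layer encoding idea can be illustrated with a minimal sketch: several intermediate layers of a vision backbone each receive a linear encoding head that predicts the measured EEG response, and the encoding error is optimized jointly with the recognition objective. Everything below is an assumption for illustration, not the authors' released code: the ResNet-18 backbone, the choice of hooked layers, the flattened per-image EEG vector, the MSE encoding loss, and the weighting `alpha` are all hypothetical stand-ins.

```python
import torch
import torch.nn as nn
import torchvision.models as models

class MultiLayerEncoder(nn.Module):
    """Sketch of image-to-brain multi-layer encoding (assumptions, not the paper's spec)."""

    def __init__(self, eeg_dim: int, layer_names=("layer2", "layer3", "layer4")):
        super().__init__()
        self.backbone = models.resnet18(weights=None)  # hypothetical stand-in backbone
        self.layer_names = layer_names
        self._feats = {}
        # Capture intermediate activations via forward hooks.
        for name in layer_names:
            getattr(self.backbone, name).register_forward_hook(self._make_hook(name))
        # One linear encoding head per hooked layer; LazyLinear infers input size
        # from the pooled feature dimension on the first forward pass.
        self.heads = nn.ModuleDict({name: nn.LazyLinear(eeg_dim) for name in layer_names})

    def _make_hook(self, name):
        def hook(module, inputs, output):
            # Global-average-pool spatial maps into a per-image feature vector.
            self._feats[name] = output.mean(dim=(2, 3))
        return hook

    def forward(self, images):
        logits = self.backbone(images)  # hooks populate self._feats as a side effect
        eeg_preds = {name: self.heads[name](self._feats[name]) for name in self.layer_names}
        return logits, eeg_preds


def joint_alignment_loss(logits, labels, eeg_preds, eeg_targets, alpha=0.5):
    # Keep object recognition (cross-entropy) while pulling multiple layers
    # toward the measured EEG responses (per-layer MSE); alpha is an assumed weight.
    ce = nn.functional.cross_entropy(logits, labels)
    enc = sum(nn.functional.mse_loss(p, eeg_targets) for p in eeg_preds.values()) / len(eeg_preds)
    return ce + alpha * enc


# Toy usage with random tensors standing in for an image/EEG dataset.
model = MultiLayerEncoder(eeg_dim=340)
images = torch.randn(8, 3, 224, 224)
labels = torch.randint(0, 1000, (8,))
eeg = torch.randn(8, 340)
logits, preds = model(images)
loss = joint_alignment_loss(logits, labels, preds, eeg)
loss.backward()
```

Optimizing the encoding error at several depths, rather than only at the final layer, is what lets the alignment shape representations throughout the hierarchy; the per-layer heads are discarded at inference, leaving a standard vision model.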