Neural synchrony is hypothesized to play a crucial role in how the brain organizes visual scenes into structured representations, enabling the robust encoding of multiple objects within a scene. However, current deep learning models often struggle with object binding, limiting their ability to represent multiple objects effectively. Inspired by neuroscience, we investigate whether synchrony-based mechanisms can enhance object encoding in artificial models trained for visual categorization. Specifically, we combine complex-valued representations with Kuramoto dynamics to promote phase alignment, facilitating the grouping of features belonging to the same object. We evaluate two architectures employing synchrony: a feedforward model and a recurrent model with feedback connections to refine phase synchronization using top-down information. Both models outperform their real-valued counterparts and complex-valued models without Kuramoto synchronization on tasks involving multi-object images, such as overlapping handwritten digits, noisy inputs, and out-of-distribution transformations. Our findings highlight the potential of synchrony-driven mechanisms to enhance deep learning models, improving their performance, robustness, and generalization in complex visual categorization tasks.
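To make the phase-alignment idea concrete, the following is a minimal sketch of the classic all-to-all Kuramoto model, in which each oscillator's phase is pulled toward the others according to dθ_i/dt = ω_i + (K/N) Σ_j sin(θ_j − θ_i). It is not the architecture evaluated in this work: the function names, the explicit Euler integration, the uniform coupling, and the default of identical natural frequencies are all assumptions made for illustration. In a complex-valued network, one could read each phase as the argument of a complex activation, with the magnitude carrying feature strength.

```python
import cmath
import math


def kuramoto_step(phases, coupling=1.0, dt=0.1, natural_freqs=None):
    """One explicit Euler step of the all-to-all Kuramoto model.

    d(theta_i)/dt = omega_i + (K / N) * sum_j sin(theta_j - theta_i)
    """
    n = len(phases)
    if natural_freqs is None:
        # Assumption for this sketch: identical oscillators (omega_i = 0).
        natural_freqs = [0.0] * n
    updated = []
    for i, theta_i in enumerate(phases):
        # Net pull of all oscillators on oscillator i.
        pull = sum(math.sin(theta_j - theta_i) for theta_j in phases)
        updated.append(theta_i + dt * (natural_freqs[i] + coupling * pull / n))
    return updated


def order_parameter(phases):
    """Magnitude of the mean unit phasor; 1.0 means fully synchronized."""
    mean_phasor = sum(cmath.exp(1j * theta) for theta in phases) / len(phases)
    return abs(mean_phasor)


def synchronize(phases, steps=200, **kwargs):
    """Apply kuramoto_step repeatedly and return the final phases."""
    for _ in range(steps):
        phases = kuramoto_step(phases, **kwargs)
    return phases


if __name__ == "__main__":
    initial = [0.0, 0.5, 1.0, 1.5, 2.0]
    final = synchronize(initial, steps=500, coupling=2.0, dt=0.05)
    print("order parameter before:", order_parameter(initial))
    print("order parameter after: ", order_parameter(final))
```

The order parameter gives a single number for how well a group of phases is aligned, which is one way to quantify whether features that should be grouped together have synchronized.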