In human cognition, the binding problem describes the open question of how the brain flexibly integrates diverse information into cohesive object representations. Analogously, in machine learning, researchers pursue models capable of strong generalization and reasoning by learning object-centric representations in an unsupervised manner. Drawing from neuroscientific theories, Rotating Features learn such representations by introducing vector-valued features that encapsulate object characteristics in their magnitudes and object affiliation in their orientations. The "$\chi$-binding" mechanism, embedded in every layer of the architecture, has been shown to be crucial but remains poorly understood. In this paper, we propose an alternative "cosine binding" mechanism, which explicitly computes the alignment between features and adjusts weights accordingly, and we show that it achieves performance equivalent to $\chi$-binding. This allows us to draw direct connections to self-attention and biological neural processes, and to shed light on the fundamental dynamics that allow object-centric representations to emerge in Rotating Features.