Object-centric learning (OCL) represents the objects in a scene as a set of slots, combining flexibility and interpretability when abstracting low-level perceptual features. A widely adopted OCL method is slot attention, which uses attention mechanisms to iteratively refine slot representations. However, a major drawback of most object-centric models, including slot attention, is that the number of slots must be fixed in advance. This not only requires prior knowledge of the dataset but also ignores the fact that the number of objects varies from instance to instance. To overcome this limitation, we present a novel complexity-aware object auto-encoder framework. Within this framework, we introduce an adaptive slot attention (AdaSlot) mechanism that dynamically determines the number of slots based on the content of the data. It does so through a discrete slot sampling module that selects an appropriate number of slots from a candidate list. Furthermore, we introduce a masked slot decoder that suppresses unselected slots during decoding. Tested extensively on object discovery tasks across various datasets, our framework matches or exceeds the performance of top fixed-slot models. Moreover, our analysis shows that the method adapts the number of slots to the complexity of each instance, opening avenues for further exploration in slot attention research. The project will be available at https://kfan21.github.io/AdaSlot/
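As a rough illustration of the two components named in the abstract, the NumPy sketch below samples a discrete keep/drop decision for each candidate slot and then composes an image from the kept slots only. It is not the authors' implementation: the abstract does not specify the sampling scheme, so the Gumbel-max trick, the per-slot `[drop, keep]` logits, the alpha-mask mixture decoder, and every function name and tensor shape here are assumptions made for illustration.

```python
import numpy as np


def sample_keep_mask(keep_logits, rng):
    """Sample a 0/1 keep decision per candidate slot (hypothetical helper).

    keep_logits: array of shape (K, 2) holding [drop, keep] logits per slot.
    Uses the Gumbel-max trick, which is an assumption of this sketch.
    """
    u = rng.uniform(low=1e-9, high=1.0, size=keep_logits.shape)
    gumbel = -np.log(-np.log(u))
    choice = np.argmax(keep_logits + gumbel, axis=-1)  # 0 = drop, 1 = keep
    mask = choice.astype(np.float64)
    if mask.sum() == 0:
        # Keep at least one slot so the decoder output stays well defined.
        mask[np.argmax(keep_logits[:, 1] - keep_logits[:, 0])] = 1.0
    return mask


def masked_mixture(rgb, alpha_logits, keep_mask):
    """Compose an image from kept slots only (hypothetical helper).

    rgb: (K, H, W, 3) per-slot reconstructions.
    alpha_logits: (K, H, W) per-slot mixing logits.
    keep_mask: (K,) 0/1 vector with at least one nonzero entry.
    """
    # Dropped slots get -inf logits, so their softmax weight is exactly zero.
    logits = np.where(keep_mask[:, None, None] > 0, alpha_logits, -np.inf)
    logits = logits - logits.max(axis=0, keepdims=True)
    weights = np.exp(logits)
    weights = weights / weights.sum(axis=0, keepdims=True)
    image = (weights[..., None] * rgb).sum(axis=0)
    return image, weights


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    num_slots, height, width = 5, 8, 8
    keep_logits = rng.normal(size=(num_slots, 2))
    keep_mask = sample_keep_mask(keep_logits, rng)
    rgb = rng.uniform(size=(num_slots, height, width, 3))
    alpha_logits = rng.normal(size=(num_slots, height, width))
    image, weights = masked_mixture(rgb, alpha_logits, keep_mask)
    print("slots kept:", int(keep_mask.sum()), "image shape:", image.shape)
```

A trainable version would need a differentiable relaxation of the sampling step (for example a straight-through estimator) in an autodiff framework, together with some penalty on the number of kept slots; both are omitted here to keep the sketch short.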