Sensitivity of Slot-Based Object-Centric Models to their Number of Slots

Self-supervised methods for learning object-centric representations have recently been applied successfully to various datasets. This progress is largely fueled by slot-based methods, whose ability to cluster visual scenes into meaningful objects holds great promise for compositional generalization and downstream learning. In these methods, the number of slots (clusters) $K$ is typically chosen to match the number of ground-truth objects in the data, even though this quantity is unknown in real-world settings. Yet the sensitivity of slot-based methods to $K$, and its effect on how well the learned slots correspond to objects in the data, has largely been ignored in the literature. In this work, we address this issue through a systematic study of slot-based methods. We propose using analogs to precision and recall based on the Adjusted Rand Index to accurately quantify model behavior over a large range of $K$. We find that, especially during training, incorrect choices of $K$ do not yield the desired object decomposition and instead cause either substantial oversegmentation or the merging of separate objects (undersegmentation). We demonstrate that the choice of objective function and the incorporation of instance-level annotations can moderately mitigate this behavior, but neither fully resolves it. We show that the issue persists across multiple methods and datasets, and we stress its importance for future slot-based models.
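As a rough illustration of why the choice of $K$ matters for evaluation, the sketch below computes the standard Adjusted Rand Index between a ground-truth object labeling and a predicted slot assignment. It is not the precision and recall analogs proposed in the paper, whose definitions are not given in the abstract; the function name `adjusted_rand_index` and the six-pixel toy labelings are assumptions made purely for illustration.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Standard Adjusted Rand Index between two flat labelings.

    Illustrative helper only; this is the plain ARI, not the
    precision/recall analogs that the paper proposes.
    """
    if len(labels_true) != len(labels_pred):
        raise ValueError("labelings must have the same length")

    total_pairs = comb(len(labels_true), 2)
    if total_pairs == 0:
        return 1.0  # fewer than two elements: trivially identical

    # Pair counts within contingency cells, true clusters, and predicted clusters.
    cells = Counter(zip(labels_true, labels_pred))
    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    sum_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())

    expected = sum_true * sum_pred / total_pairs
    max_index = 0.5 * (sum_true + sum_pred)
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both labelings are a single cluster
    return (sum_cells - expected) / (max_index - expected)


# Toy scene (assumed): six pixels belonging to two ground-truth objects.
truth = [0, 0, 0, 1, 1, 1]
matched = [1, 1, 1, 0, 0, 0]        # K = 2, slots permuted but correct
oversegmented = [0, 0, 1, 2, 2, 2]  # K = 3, first object split across two slots
undersegmented = [0, 0, 0, 0, 0, 0] # K = 1, both objects merged into one slot

ari_matched = adjusted_rand_index(truth, matched)
ari_over = adjusted_rand_index(truth, oversegmented)
ari_under = adjusted_rand_index(truth, undersegmented)
```

In this toy case the plain ARI drops for both the oversegmented and the undersegmented assignment, but the single score does not say which of the two failure modes occurred. That limitation is one plausible motivation for separate precision-like and recall-like measures.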

