Unsupervised Object Learning via Common Fate

Matthias Tangemann, Steffen Schneider, Julius von Kügelgen, Francesco Locatello, Peter Gehler, Thomas Brox, Matthias Kümmerer, Matthias Bethge, Bernhard Schölkopf

Learning generative object models from unlabelled videos is a long-standing problem and is required for causal scene modeling. We decompose this problem into three easier subtasks and provide candidate solutions for each of them. Inspired by the Common Fate Principle of Gestalt Psychology, we first extract (noisy) masks of moving objects via unsupervised motion segmentation. Second, generative models are trained on the masks of the background and the moving objects, respectively. Third, background and foreground models are combined in a conditional "dead leaves" scene model to sample novel scene configurations in which occlusions and depth layering arise naturally. To evaluate the individual stages, we introduce the Fishbowl dataset, positioned between complex real-world scenes and common object-centric benchmarks of simplistic objects. We show that our approach allows learning generative models that generalize beyond the occlusions present in the input videos and that represent scenes in a modular fashion. This modularity allows sampling plausible scenes outside the training distribution, for instance with object numbers or densities not observed in the training set.
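
The layering idea behind a "dead leaves" scene model can be illustrated with a minimal sketch. This is not the authors' implementation: the function name compose_dead_leaves, the argument layout, and the use of NumPy arrays are assumptions made for illustration. Objects are pasted onto a background from farthest to nearest, so nearer objects overwrite farther ones and occlusion falls out of the paste order.

```python
import numpy as np


def compose_dead_leaves(background, objects):
    """Paste objects onto a background, farthest first (hypothetical helper).

    background: float array of shape (H, W, 3).
    objects: list of (rgb, mask) pairs ordered far to near, with rgb of
             shape (H, W, 3) and mask of shape (H, W) with values in [0, 1].
    Returns the composite image and the visible (unoccluded) mask per object.
    """
    scene = background.astype(float).copy()
    visible = []
    for rgb, mask in objects:
        alpha = mask.astype(float)[..., None]
        # The current object covers whatever was rendered before it.
        scene = alpha * rgb + (1.0 - alpha) * scene
        # Objects placed earlier lose visibility where this one overlaps them.
        visible = [v * (1.0 - mask) for v in visible]
        visible.append(mask.astype(float).copy())
    return scene, visible
```

Because each object contributes only its own appearance and full mask, changing the number of objects or their order needs no change to the object models themselves, which is the kind of modularity the abstract refers to.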

