Boosting Object Representation Learning via Motion and Object Continuity

Recent unsupervised multi-object detection models have shown impressive performance improvements, largely attributed to novel architectural inductive biases. Unfortunately, they may produce suboptimal object encodings for downstream tasks. To overcome this, we propose to exploit object motion and continuity, i.e., the fact that objects do not pop in and out of existence. This is accomplished through two mechanisms: (i) providing priors on the location of objects by integrating optical flow, and (ii) a contrastive object continuity loss across consecutive image frames. Rather than developing an explicit deep architecture, the resulting Motion and Object Continuity (MOC) scheme can be instantiated using any baseline object detection model. Our results show large improvements in the performance of a state-of-the-art model in terms of object discovery, convergence speed, and overall latent object representations, particularly for playing Atari games. Overall, we show clear benefits of integrating motion and object continuity for downstream tasks, moving beyond object representation learning based only on reconstruction.
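As a rough illustration of mechanism (ii), the sketch below computes an InfoNCE-style contrastive loss over object encodings from two consecutive frames. It is not the authors' formulation, which the abstract does not specify. The function name `continuity_contrastive_loss`, the nearest-position matching rule, the cosine similarity, and the `temperature` value are all assumptions made for this example.

```python
import numpy as np


def continuity_contrastive_loss(z_t, z_next, pos_t, pos_next, temperature=0.1):
    """Hypothetical object continuity loss between frames t and t+1.

    z_t:      (N, D) object encodings in frame t
    z_next:   (M, D) object encodings in frame t+1
    pos_t:    (N, 2) object positions in frame t
    pos_next: (M, 2) object positions in frame t+1
    """
    # Assumption: the same object is the spatially nearest one in the next frame.
    dists = np.linalg.norm(pos_t[:, None, :] - pos_next[None, :, :], axis=-1)
    match = dists.argmin(axis=1)  # (N,) index of the positive in frame t+1

    # Cosine similarity between every encoding in frame t and frame t+1.
    a = z_t / np.linalg.norm(z_t, axis=1, keepdims=True)
    b = z_next / np.linalg.norm(z_next, axis=1, keepdims=True)
    logits = (a @ b.T) / temperature  # (N, M)

    # Numerically stable log-softmax over the candidates in frame t+1.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # Pull each object toward its match, push it away from the other objects.
    return float(-log_probs[np.arange(len(match)), match].mean())
```

Under these assumptions, the loss is small when an object's encoding stays stable from one frame to the next and grows when encodings of different objects are confused. In a scheme like the one described, such a term would be added to the baseline detector's reconstruction objective.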

