Recognizing and generating object-state compositions remains a challenging task, especially when generalizing to unseen compositions. In this paper, we study the task of cutting objects in different styles and the resulting object state changes. We propose a new benchmark suite, Chop & Learn, to support learning objects and their cut styles from multiple viewpoints. We also propose a new task, Compositional Image Generation, which transfers learned cut styles to different objects by generating novel object-state images. In addition, we use the videos for Compositional Action Recognition and show valuable uses of this dataset for multiple video tasks. Project website: https://chopnlearn.github.io.