This paper proposes a probabilistic method for predicting long motions. The motion is predicted so that it accomplishes a task starting from the initial state observed in a given image. Our method evaluates task achievability with an Energy-Based Model (EBM), but previous EBMs are not designed to evaluate consistency between different domains (image and motion, in our method). Our method integrates the image and motion data into the image feature domain through spatially-aligned temporal encoding, so that features are extracted along the motion trajectory projected onto the image. Furthermore, this paper proposes a data-driven motion optimization method, Deep Motion Optimizer (DMO), that works with the EBM for motion prediction. Unlike previous gradient-based optimizers, our self-supervised DMO alleviates the difficulty of tuning hyper-parameters to avoid local minima. The effectiveness of the proposed method is demonstrated through a variety of experiments comparing it with similar state-of-the-art (SOTA) methods.
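To make the idea of extracting features along a projected trajectory concrete, the following is a minimal NumPy sketch and not the authors' implementation. It assumes the motion has already been projected to sub-pixel (x, y) image coordinates and that the image features form an (H, W, C) array with H and W of at least 2; the function name `sample_features_along_trajectory`, the array shapes, and the use of bilinear interpolation are all illustrative assumptions. The temporal encoding, the EBM, and DMO are not shown.

```python
import numpy as np


def sample_features_along_trajectory(feature_map, trajectory_xy):
    """Bilinearly sample an (H, W, C) feature map at T sub-pixel (x, y) points.

    Hypothetical helper for illustration only.
    Returns an array of shape (T, C), one feature vector per time step.
    """
    h, w, _ = feature_map.shape
    # Clamp the points so that all four neighbouring pixels stay in bounds.
    x = np.clip(trajectory_xy[:, 0], 0.0, w - 1.0)
    y = np.clip(trajectory_xy[:, 1], 0.0, h - 1.0)
    x0 = np.minimum(np.floor(x).astype(int), w - 2)
    y0 = np.minimum(np.floor(y).astype(int), h - 2)
    x1, y1 = x0 + 1, y0 + 1
    # Fractional offsets act as the interpolation weights.
    wx = (x - x0)[:, None]
    wy = (y - y0)[:, None]
    top = (1 - wx) * feature_map[y0, x0] + wx * feature_map[y0, x1]
    bottom = (1 - wx) * feature_map[y1, x0] + wx * feature_map[y1, x1]
    return (1 - wy) * top + wy * bottom


# Toy feature map whose two channels store each pixel's own (x, y) coordinate,
# so the sampled features should reproduce the trajectory itself.
H, W = 4, 5
feature_map = np.stack(
    np.meshgrid(np.arange(W), np.arange(H)), axis=-1
).astype(float)
trajectory = np.array([[0.5, 0.5], [2.25, 1.0], [3.0, 2.75]])
sampled = sample_features_along_trajectory(feature_map, trajectory)
```

Because the sampled features are indexed by time step, the result is a motion-ordered sequence of image features, which is one plausible reading of integrating both domains in the image feature space.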