In many real-world settings, image observations of freely rotating 3D rigid bodies may be available when low-dimensional measurements are not. However, the high dimensionality of image data precludes the use of classical estimation techniques to learn the dynamics. Standard deep learning methods are also of limited use, because an image of a rigid body reveals nothing about the distribution of mass inside the body, which, together with the initial angular velocity, determines how the body will rotate. We present a physics-informed neural network model that estimates and predicts 3D rotational dynamics from image sequences. The model uses a multi-stage prediction pipeline that maps individual images to a latent representation homeomorphic to $\mathbf{SO}(3)$, computes angular velocities from pairs of latent states, and predicts future latent states using the Hamiltonian equations of motion. We demonstrate the efficacy of our approach on new datasets of synthetic image sequences of rotating rigid bodies, including cubes, prisms, and satellites, with unknown uniform and non-uniform mass distributions.
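To make the last two stages of the pipeline concrete, the following is a minimal NumPy sketch and not the model described above. It assumes the latent states are already rotation matrices in $\mathbf{SO}(3)$ and that the inertia matrix `J` is known, whereas in the abstract's setting the image-to-latent map is learned and the mass distribution is unknown. All function names (`body_angular_velocity`, `rollout`, and the helpers) are hypothetical. The angular velocity is taken from a pair of orientations through the matrix logarithm, and the rollout integrates the free rigid-body equations in their Hamiltonian (Lie-Poisson) form, $\dot{\Pi} = \Pi \times J^{-1}\Pi$ and $\dot{R} = R\,\hat{\omega}$ with $\Pi = J\omega$.

```python
import numpy as np


def hat(w):
    """Map a 3-vector to the corresponding skew-symmetric matrix."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])


def exp_so3(w):
    """Rodrigues' formula: rotation matrix for the rotation vector w."""
    theta = np.linalg.norm(w)
    K = hat(w)
    if theta < 1e-8:
        # Second-order Taylor expansion near the identity.
        return np.eye(3) + K + 0.5 * K @ K
    return (np.eye(3)
            + np.sin(theta) / theta * K
            + (1.0 - np.cos(theta)) / theta**2 * K @ K)


def log_so3(R):
    """Rotation vector of R (valid for rotation angles below pi)."""
    cos_theta = np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0)
    theta = np.arccos(cos_theta)
    v = np.array([R[2, 1] - R[1, 2], R[0, 2] - R[2, 0], R[1, 0] - R[0, 1]])
    if theta < 1e-8:
        return 0.5 * v
    return theta / (2.0 * np.sin(theta)) * v


def body_angular_velocity(R0, R1, dt):
    """Body-frame angular velocity from two orientations dt apart,
    assuming constant angular velocity over the interval."""
    return log_so3(R0.T @ R1) / dt


def momentum_rhs(Pi, J_inv):
    """Euler's equations for body angular momentum: dPi/dt = Pi x J^{-1} Pi."""
    return np.cross(Pi, J_inv @ Pi)


def rollout(R, omega, J, dt, steps):
    """Predict future orientations with RK4 on Pi and an exponential-map
    update on R. J is assumed known here (an assumption of this sketch)."""
    J_inv = np.linalg.inv(J)
    Pi = J @ omega
    traj = [R]
    for _ in range(steps):
        k1 = momentum_rhs(Pi, J_inv)
        k2 = momentum_rhs(Pi + 0.5 * dt * k1, J_inv)
        k3 = momentum_rhs(Pi + 0.5 * dt * k2, J_inv)
        k4 = momentum_rhs(Pi + dt * k3, J_inv)
        Pi_next = Pi + dt / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
        # Use the averaged momentum to advance R; the update stays on SO(3).
        omega_mid = J_inv @ (0.5 * (Pi + Pi_next))
        R = R @ exp_so3(dt * omega_mid)
        Pi = Pi_next
        traj.append(R)
    return np.stack(traj), Pi
```

Updating `R` through the exponential map keeps every predicted state a valid rotation matrix, which is the reason to work with a latent space homeomorphic to $\mathbf{SO}(3)$ rather than an unconstrained vector space. In a learned variant, `J` would presumably be a trainable parameter rather than a given constant.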