In many real-world settings, image observations of freely rotating 3D rigid bodies may be available when low-dimensional measurements are not. However, the high dimensionality of image data precludes the use of classical estimation techniques to learn the dynamics. The usefulness of standard deep learning methods is also limited, because an image of a rigid body reveals nothing about the distribution of mass inside the body, which, together with the initial angular velocity, determines how the body will rotate. We present a physics-informed neural network model to estimate and predict 3D rotational dynamics from image sequences. We achieve this using a multi-stage prediction pipeline that maps individual images to a latent representation homeomorphic to $\mathbf{SO}(3)$, computes angular velocities from latent pairs, and predicts future latent states using the Hamiltonian equations of motion. We demonstrate the efficacy of our approach on new datasets of synthetic image sequences of rotating rigid bodies, including cubes, prisms, and satellites, with unknown uniform and non-uniform mass distributions.
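The last two stages of the pipeline described above (angular velocity from a pair of orientations, then a Hamiltonian rollout) can be illustrated with a minimal NumPy sketch. This is not the model presented here: the image encoder is omitted, orientations are taken directly as rotation matrices, and the inertia matrix `J` is assumed known, whereas the abstract treats the mass distribution as unknown. All function names (`angular_velocity_from_pair`, `rollout`, and so on) are hypothetical, and the integrator (RK4 for the body angular momentum, exponential-map update for the orientation) is an assumed choice.

```python
import numpy as np

def hat(w):
    """Map a 3-vector to the skew-symmetric matrix with hat(w) @ v == cross(w, v)."""
    x, y, z = w
    return np.array([[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]])

def exp_so3(w):
    """Rodrigues' formula: rotation matrix for the axis-angle vector w."""
    theta = np.linalg.norm(w)
    K = hat(w)
    if theta < 1e-8:  # second-order Taylor expansion near the identity
        return np.eye(3) + K + 0.5 * K @ K
    return (np.eye(3) + (np.sin(theta) / theta) * K
            + ((1.0 - np.cos(theta)) / theta**2) * K @ K)

def log_so3(R):
    """Axis-angle vector of a rotation matrix (valid for angles below pi)."""
    cos_theta = np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0)
    theta = np.arccos(cos_theta)
    skew = np.array([R[2, 1] - R[1, 2], R[0, 2] - R[2, 0], R[1, 0] - R[0, 1]])
    if theta < 1e-8:
        return 0.5 * skew
    return theta / (2.0 * np.sin(theta)) * skew

def angular_velocity_from_pair(R0, R1, dt):
    """Body-frame angular velocity carrying R0 to R1 in time dt,
    assuming the velocity is constant over the step."""
    return log_so3(R0.T @ R1) / dt

def euler_rhs(Pi, J_inv):
    """Free rigid body in Hamiltonian form: dPi/dt = Pi x (J^{-1} Pi),
    with H(Pi) = 0.5 * Pi . J^{-1} Pi and Pi = J @ omega."""
    return np.cross(Pi, J_inv @ Pi)

def rollout(R0, omega0, J, dt, steps):
    """Predict future orientations from an initial orientation and angular velocity.
    J is assumed known here; in the setting above it would have to be learned."""
    J_inv = np.linalg.inv(J)
    R, Pi = R0.copy(), J @ omega0
    Rs, Pis = [R], [Pi]
    for _ in range(steps):
        # RK4 step for the body angular momentum Pi
        k1 = euler_rhs(Pi, J_inv)
        k2 = euler_rhs(Pi + 0.5 * dt * k1, J_inv)
        k3 = euler_rhs(Pi + 0.5 * dt * k2, J_inv)
        k4 = euler_rhs(Pi + dt * k3, J_inv)
        Pi_next = Pi + (dt / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
        # Orientation update dR/dt = R hat(omega), using the midpoint velocity
        omega_mid = J_inv @ (0.5 * (Pi + Pi_next))
        R = R @ exp_so3(dt * omega_mid)
        Pi = Pi_next
        Rs.append(R)
        Pis.append(Pi)
    return np.stack(Rs), np.stack(Pis)

if __name__ == "__main__":
    J = np.diag([1.0, 2.0, 3.0])  # illustrative non-uniform inertia, not from the datasets
    Rs, Pis = rollout(np.eye(3), np.array([0.3, 1.0, 0.2]), J, dt=0.01, steps=500)
    energy = 0.5 * np.einsum("ti,ti->t", Pis, Pis @ np.linalg.inv(J))
    print("max energy drift:", np.abs(energy - energy[0]).max())
```

Changing `J` while keeping `R0` and `omega0` fixed produces a different trajectory from the same first frame, which is the ambiguity the abstract points to: a single image constrains the orientation but not the mass distribution.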