This paper investigates the challenge of learning image manifolds, specifically pose manifolds, of 3D objects from limited training data. It proposes a deep neural network (DNN) approach that learns the manifold and predicts images of objects under novel, continuous 3D rotations. The approach combines two components: (1) a Geometric Style-GAN (Geom-SGAN), which maps images to low-dimensional latent representations while maintaining the first-order manifold geometry, that is, it seeks to preserve the pairwise distances between base points and their tangent spaces; and (2) Euler's elastica, used to interpolate smoothly between directed points (points plus tangent directions) in the low-dimensional latent space. When mapped back to the higher-dimensional image space, the resulting interpolations resemble videos of rotating objects. Extensive experiments establish the superiority of this framework in learning paths on rotation manifolds, both visually and quantitatively, relative to state-of-the-art GANs and VAEs.
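The two geometric quantities mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the paper's loss or its elastica solver: the function names (`distance_distortion`, `discrete_bending_energy`) are hypothetical, the first function assumes that "preserving pairwise distances" is measured as a mean squared gap between image-space and latent-space distance matrices, and the second uses a standard polyline approximation of the squared-curvature integral that an elastica curve minimizes for given end points and end tangents.

```python
import numpy as np


def pairwise_distances(points):
    """Euclidean distance matrix for the rows of `points` (shape: n x d)."""
    diffs = points[:, None, :] - points[None, :, :]
    return np.linalg.norm(diffs, axis=-1)


def distance_distortion(images, latents):
    """Mean squared gap between image-space and latent-space pairwise distances.

    Assumption: a stand-in for a distance-preservation penalty, not the
    objective used by Geom-SGAN.
    """
    gap = pairwise_distances(images) - pairwise_distances(latents)
    return float(np.mean(gap ** 2))


def discrete_bending_energy(path):
    """Approximate the integral of squared curvature along a polyline (n x d).

    Assumption: consecutive points of `path` are distinct, so every edge
    has nonzero length.
    """
    edges = np.diff(path, axis=0)
    lengths = np.linalg.norm(edges, axis=1)
    tangents = edges / lengths[:, None]
    # Turning angle at each interior vertex of the polyline.
    cosines = np.clip(np.sum(tangents[:-1] * tangents[1:], axis=1), -1.0, 1.0)
    angles = np.arccos(cosines)
    # Curvature is roughly angle / ds, so kappa^2 * ds is angle^2 / ds.
    ds = 0.5 * (lengths[:-1] + lengths[1:])
    return float(np.sum(angles ** 2 / ds))
```

Under these assumptions, a latent map that is an exact isometry has zero distortion, and a straight latent path has zero bending energy; any path that turns incurs a positive energy, which is the cost a smooth interpolant between two directed points tries to keep small.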