Synthesizing novel views from a single image is a highly ill-posed problem. We find that expanding the single-view view synthesis problem to a multi-view setting effectively reduces the learning ambiguity. Specifically, we leverage the reliable and explicit stereo prior to generate a pseudo-stereo viewpoint, which serves as an auxiliary input for constructing the 3D space. In this way, the challenging novel view synthesis process is decoupled into two simpler problems: stereo synthesis and 3D reconstruction. To synthesize a structurally correct and detail-preserving stereo image, we propose a self-rectified stereo synthesis that amends erroneous regions in an identify-rectify manner. Hard-to-train and incorrectly warped samples are first identified by two strategies: (1) pruning the network to reveal low-confidence predictions; and (2) bidirectional matching between the stereo images to expose improper mappings. These regions are then inpainted to form the final pseudo-stereo image. With the aid of this extra input, a better 3D reconstruction can be obtained easily, and our method works with arbitrary 3D representations. Extensive experiments show that our method outperforms state-of-the-art single-view view synthesis methods and stereo synthesis methods.
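The abstract does not say how the bidirectional matching is implemented. As a rough illustration only, the sketch below uses a standard left-right disparity consistency check, which is one common way to flag improper mappings between a stereo pair. The function name `left_right_consistency_mask`, the `threshold` parameter, and the disparity sign convention are assumptions made for this example, not details taken from the paper.

```python
import numpy as np


def left_right_consistency_mask(disp_left, disp_right, threshold=1.0):
    """Flag left-image pixels whose stereo mapping is not consistent.

    Assumed convention (hypothetical, for illustration):
      - disp_left[y, x] = d means left pixel (x, y) matches right pixel (x - d, y).
      - disp_right[y, x] holds the disparity estimated from the right view.
    Returns a boolean array that is True where the mapping looks improper.
    """
    h, w = disp_left.shape
    xs = np.tile(np.arange(w), (h, 1))            # column index of every pixel
    ys = np.tile(np.arange(h)[:, None], (1, w))   # row index of every pixel

    # Where each left pixel lands in the right image (nearest column).
    x_right = np.rint(xs - disp_left).astype(int)
    out_of_view = (x_right < 0) | (x_right >= w)
    x_right = np.clip(x_right, 0, w - 1)

    # Disparity the right view reports at the landing position.
    disp_back = disp_right[ys, x_right]

    # A mapping is suspicious if the two views disagree by more than
    # the threshold, or if the pixel leaves the right image entirely.
    return (np.abs(disp_left - disp_back) > threshold) | out_of_view


if __name__ == "__main__":
    left = np.full((4, 8), 2.0)
    right = np.full((4, 8), 2.0)
    left[1, 5] = 5.0  # simulate one erroneous disparity
    mask = left_right_consistency_mask(left, right)
    print(mask.astype(int))
```

In a pipeline like the one described, pixels flagged by such a mask would be the candidates handed to the inpainting step; the actual criterion used by the authors may differ.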