Bridging Implicit and Explicit Geometric Transformation for Single-Image View Synthesis

Novel view synthesis from a single image has made tremendous strides with advanced autoregressive models, since unseen regions have to be inferred from the visible scene contents. Although recent methods generate high-quality novel views, synthesizing with only one kind of 3D geometry, either explicit or implicit, forces a trade-off between two objectives that we call the "seesaw" problem: 1) preserving reprojected contents and 2) completing realistic out-of-view regions. In addition, autoregressive models incur a considerable computational cost. In this paper, we propose a single-image view synthesis framework that mitigates the seesaw problem while using an efficient non-autoregressive model. Motivated by the observation that explicit methods preserve reprojected pixels well and implicit methods complete realistic out-of-view regions, we introduce a loss function that lets the two renderers complement each other. Our loss function encourages explicit features to improve the reprojected area of implicit features, and implicit features to improve the out-of-view area of explicit features. With the proposed architecture and loss function, we alleviate the seesaw problem, outperforming autoregressive state-of-the-art methods while generating an image $\approx$100 times faster. We validate the efficiency and effectiveness of our method with experiments on the RealEstate10K and ACID datasets.
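
The abstract does not give the form of the loss, so the following is only an illustrative sketch of the idea that each renderer teaches the other in the region where it is stronger. It assumes flattened per-pixel features, a boolean visibility mask marking reprojected pixels, and a squared-error penalty; the function name `complementary_loss`, the mask, and the penalty are all hypothetical choices and not the paper's actual formulation. Gradients are written out by hand to make the direction of supervision explicit: in the reprojected area only the implicit features receive a gradient, and in the out-of-view area only the explicit features do.

```python
def complementary_loss(explicit, implicit, visible):
    """Hypothetical mask-directed feature-matching loss (not the paper's loss).

    explicit, implicit: equal-length lists of floats (flattened features).
    visible: list of bools, True where the pixel is covered by reprojection.
    Returns (loss, grad_explicit, grad_implicit).
    """
    loss = 0.0
    grad_explicit = []
    grad_implicit = []
    for e, i, v in zip(explicit, implicit, visible):
        diff = i - e
        loss += diff * diff  # squared error, summed over all positions
        if v:
            # Reprojected area: explicit features act as a fixed target,
            # so only the implicit features are updated.
            grad_implicit.append(2.0 * diff)
            grad_explicit.append(0.0)
        else:
            # Out-of-view area: implicit features act as a fixed target,
            # so only the explicit features are updated.
            grad_implicit.append(0.0)
            grad_explicit.append(-2.0 * diff)
    return loss, grad_explicit, grad_implicit


# Toy usage: the first position is reprojected, the second is out of view.
loss, g_exp, g_imp = complementary_loss([1.0, 2.0], [0.0, 4.0], [True, False])
print(loss, g_exp, g_imp)
```

In an automatic-differentiation framework the same directionality would typically be expressed with a stop-gradient on the side treated as the target, rather than with hand-written gradients.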

