Recovering the shape and appearance of real-world objects from natural 2D images is a long-standing and challenging inverse rendering problem. In this paper, we introduce a novel hybrid differentiable rendering method that efficiently reconstructs the 3D geometry and reflectance of a scene from multi-view images captured by conventional hand-held cameras. Our method follows an analysis-by-synthesis approach and consists of two phases. In the initialization phase, we use traditional structure-from-motion (SfM) and multi-view stereo (MVS) methods to reconstruct a virtual scene that roughly matches the real scene. In the optimization phase, we adopt a hybrid approach to refine the geometry and reflectance: the geometry is optimized first using an approximate differentiable rendering method, and the reflectance is optimized afterward using a physically-based differentiable rendering method. This hybrid approach combines the efficiency of approximate methods with the high-quality results of physically-based methods. Extensive experiments on synthetic and real data demonstrate that our method produces reconstructions of quality comparable to or higher than that of state-of-the-art methods while being more efficient.
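The two-stage optimization described above can be illustrated with a deliberately small toy problem. The sketch below is not the method of this paper: it assumes a 1D "scene" (a segment of half-width `r` with a single Lambertian albedo), a sigmoid-based soft mask standing in for the approximate differentiable renderer, and a known-irradiance shading model standing in for the physically-based renderer. All names (`soft_mask`, `optimize_geometry`, `optimize_reflectance`) and all constants are hypothetical and chosen only for illustration.

```python
import math

# Toy 1D image: 101 pixels on [-1, 1]. Illustrative assumption, not from the paper.
PIXELS = [(i - 50) / 50 for i in range(101)]
# Known lighting: Lambertian cosine term for a light 30 degrees off the normal.
IRRADIANCE = math.cos(math.radians(30.0))


def hard_mask(r):
    """Binary coverage of the segment [-r, r]."""
    return [1.0 if abs(x) <= r else 0.0 for x in PIXELS]


def soft_mask(r, sigma=0.05):
    """Soft coverage: sigmoid of the signed distance to the segment edge."""
    return [1.0 / (1.0 + math.exp(-(r - abs(x)) / sigma)) for x in PIXELS]


def render(r, albedo):
    """Toy shading model: albedo * irradiance inside the segment, 0 outside."""
    return [albedo * IRRADIANCE * m for m in hard_mask(r)]


def optimize_geometry(target_mask, r_init, sigma=0.05, lr=0.05, steps=200):
    """Stage 1: gradient descent on r using the soft (approximate) mask."""
    r, n = r_init, len(PIXELS)
    for _ in range(steps):
        cov = soft_mask(r, sigma)
        # d/dr of mean squared error; d(cov)/dr = cov * (1 - cov) / sigma.
        grad = sum(2 * (c - m) * c * (1 - c) / sigma
                   for c, m in zip(cov, target_mask)) / n
        r -= lr * grad
    return r


def optimize_reflectance(target_image, r, albedo_init, lr=0.5, steps=100):
    """Stage 2: geometry fixed, gradient descent on albedo with the shading model."""
    albedo, n = albedo_init, len(PIXELS)
    mask = hard_mask(r)
    for _ in range(steps):
        grad = sum(2 * (albedo * IRRADIANCE * m - t) * IRRADIANCE * m
                   for m, t in zip(mask, target_image)) / n
        albedo -= lr * grad
    return albedo


# Synthetic "captured" data with hidden ground truth.
R_TRUE, ALBEDO_TRUE = 0.61, 0.7
target_image = render(R_TRUE, ALBEDO_TRUE)
target_mask = [1.0 if t > 0 else 0.0 for t in target_image]

# A rough initial guess plays the role of the SfM/MVS initialization.
r_est = optimize_geometry(target_mask, r_init=0.3)
albedo_est = optimize_reflectance(target_image, r_est, albedo_init=0.2)
print(f"r = {r_est:.3f}, albedo = {albedo_est:.3f}")
```

The ordering mirrors the structure of the abstract: the cheap, smooth approximation is used where only the shape matters, and the shading model is evaluated only after the geometry is fixed. A real pipeline would replace both stand-ins with actual differentiable renderers and optimize meshes and spatially varying materials rather than two scalars.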