Modern reconstruction techniques can effectively model complex 3D scenes from sparse 2D views. However, automatically assessing the quality of novel views and identifying artifacts is challenging due to the lack of ground-truth images and the limitations of no-reference image metrics in predicting detailed artifact maps. The absence of such quality metrics hinders accurate prediction of the quality of generated views and limits the adoption of post-processing techniques, such as inpainting, to enhance reconstruction quality. In this work, we propose a new no-reference metric, Puzzle Similarity, designed to localize artifacts in novel views. Our approach uses image patch statistics from the input views to establish a scene-specific distribution, which is then used to identify poorly reconstructed regions in novel views. We test and evaluate our method in the context of 3D reconstruction; to this end, we collected a novel dataset of human quality assessments of unseen reconstructed views. Through this dataset, we demonstrate that our method can not only successfully localize artifacts in novel views, correlating with human assessment, but do so without direct references. Surprisingly, our metric outperforms both no-reference metrics and popular full-reference image metrics. Our new metric can be leveraged to enhance applications such as automatic image restoration, guided acquisition, and 3D reconstruction from sparse inputs.
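To make the core idea concrete, here is a minimal toy sketch of scene-specific patch statistics used for no-reference artifact localization. Note this is an illustration, not the paper's actual Puzzle Similarity metric: it models raw pixel patches with a single Gaussian and scores novel-view patches by Mahalanobis distance, whereas the real method is more sophisticated. All function names and parameters here are hypothetical.

```python
import numpy as np

def extract_patches(img, size=8):
    # Slice a grayscale image into flattened, non-overlapping patches.
    H, W = img.shape
    return np.array([
        img[y:y + size, x:x + size].ravel()
        for y in range(0, H - size + 1, size)
        for x in range(0, W - size + 1, size)
    ])

def fit_patch_statistics(input_views, size=8):
    # Pool patches from all input (reference) views and fit a Gaussian,
    # giving a scene-specific distribution of "plausible" patches.
    patches = np.concatenate([extract_patches(v, size) for v in input_views])
    mu = patches.mean(axis=0)
    # Regularize the covariance so it stays invertible with few patches.
    cov = np.cov(patches, rowvar=False) + 1e-3 * np.eye(patches.shape[1])
    return mu, np.linalg.inv(cov)

def artifact_map(novel_view, mu, cov_inv, size=8):
    # Score each novel-view patch by Mahalanobis distance to the
    # scene distribution; high distance flags a likely artifact region.
    H, W = novel_view.shape
    n_y, n_x = (H - size) // size + 1, (W - size) // size + 1
    diffs = extract_patches(novel_view, size) - mu
    d = np.einsum('ij,jk,ik->i', diffs, cov_inv, diffs)
    return d.reshape(n_y, n_x)
```

In a full pipeline, the resulting per-patch map could be thresholded to drive post-processing such as inpainting of the flagged regions.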