Generalizable NeRF can directly synthesize novel views of new scenes, eliminating the scene-specific retraining that vanilla NeRF requires. A critical enabling factor in these approaches is the extraction of a generalizable 3D representation by aggregating source-view features. In this paper, we propose an Entangled View-Epipolar Information Aggregation method dubbed EVE-NeRF. Unlike existing methods that consider cross-view and along-epipolar information independently, EVE-NeRF conducts view-epipolar feature aggregation in an entangled manner by injecting scene-invariant appearance continuity and geometry consistency priors into the aggregation process. Our approach effectively mitigates the potential lack of inherent geometric and appearance constraints that results from one-dimensional interactions, thus further boosting the generalizability of the 3D representation. EVE-NeRF attains state-of-the-art performance across various evaluation scenarios. Extensive experiments demonstrate that, compared to prevailing single-dimensional aggregation, the entangled network excels in the accuracy of 3D scene geometry and appearance reconstruction. Our project page is https://github.com/tatakai1/EVENeRF.
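To make the two aggregation directions named in the abstract concrete, the following is a minimal sketch of one-dimensional cross-view and along-epipolar interactions over a source-view feature tensor. It is not the EVE-NeRF architecture: the tensor layout `(V, P, C)` (views, sample points, channels), the function names, and the use of plain dot-product self-attention without learned projections are all assumptions made for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(tokens):
    # Plain dot-product self-attention over the token axis.
    # tokens: (..., T, C); no learned projections (illustrative assumption).
    c = tokens.shape[-1]
    scores = tokens @ np.swapaxes(tokens, -1, -2) / np.sqrt(c)  # (..., T, T)
    return softmax(scores, axis=-1) @ tokens                    # (..., T, C)

def cross_view_aggregate(feats):
    # feats: (V, P, C). Each sample point attends across the V source views.
    per_point = np.swapaxes(feats, 0, 1)                  # (P, V, C)
    return np.swapaxes(self_attention(per_point), 0, 1)   # back to (V, P, C)

def along_epipolar_aggregate(feats):
    # feats: (V, P, C). Within each view, the P samples along the
    # epipolar line attend to one another.
    return self_attention(feats)

# Hypothetical sizes: 4 source views, 16 samples per ray, 8 channels.
rng = np.random.default_rng(0)
feats = rng.standard_normal((4, 16, 8))
view_out = cross_view_aggregate(feats)
epi_out = along_epipolar_aggregate(feats)
```

In this sketch each function mixes information along a single axis only: `cross_view_aggregate` never lets different sample points interact, and `along_epipolar_aggregate` never lets different views interact. This is the kind of one-dimensional interaction that the abstract contrasts with entangled aggregation.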