We present Layered Ray Intersections (LaRI), a fully supervised method for reasoning about occluded geometry from a single image. Unlike conventional depth estimation, which is limited to visible surfaces, LaRI uses layered point maps to predict the multiple surfaces that each camera ray intersects. In contrast to existing approaches that rely on neural implicit representations or iterative refinement, LaRI reconstructs the complete scene in one feed-forward pass, enabling efficient, view-aligned geometric reasoning for both object-level and scene-level tasks. We further propose predicting a ray stopping index, which identifies the valid intersecting pixels and layers in LaRI's output. To support and evaluate this task, we build an annotation pipeline based on rendering engines and construct annotations for five public datasets, spanning synthetic and real-world data and covering both 3D objects and scenes. As a generic method, LaRI is validated on object-level and scene-level reconstruction tasks.
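To make the roles of the layered point map and the ray stopping index concrete, here is a minimal NumPy sketch. The abstract does not specify the tensor layout or the index convention, so everything below is an assumption for illustration: the point map is taken to have shape `(H, W, L, 3)`, and `stop_index[y, x] = k` is taken to mean that the ray through pixel `(y, x)` has `k` valid intersections stored in layers `0..k-1`. The function names `valid_layer_mask` and `extract_points` are hypothetical and are not LaRI's API.

```python
import numpy as np


def valid_layer_mask(stop_index: np.ndarray, num_layers: int) -> np.ndarray:
    """Return a boolean mask of shape (H, W, L) marking valid intersections.

    Assumed convention (not taken from the paper): stop_index[y, x] = k means
    layers 0..k-1 hold valid hits for that pixel, and k = 0 means no hit.
    """
    layers = np.arange(num_layers)  # shape (L,)
    # Broadcast (1, 1, L) against (H, W, 1) to compare every layer per pixel.
    return layers[None, None, :] < stop_index[:, :, None]


def extract_points(point_maps: np.ndarray, stop_index: np.ndarray) -> np.ndarray:
    """Gather the valid 3D points from a layered point map of shape (H, W, L, 3)."""
    mask = valid_layer_mask(stop_index, point_maps.shape[2])
    return point_maps[mask]  # shape (N, 3), N = total number of valid hits


# Toy input: a 2x2 image with up to 3 layers per ray.
H, W, L = 2, 2, 3
point_maps = np.arange(H * W * L * 3, dtype=np.float32).reshape(H, W, L, 3)
stop_index = np.array([[0, 1],
                       [2, 3]])  # number of valid layers per pixel

points = extract_points(point_maps, stop_index)
print(points.shape)  # (6, 3), since 0 + 1 + 2 + 3 = 6 valid intersections
```

Under these assumptions, a single masking step turns the dense per-pixel, per-layer prediction into a point cloud that includes occluded surfaces, which is one plausible reading of how the stopping index selects "valid intersecting pixels and layers".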