As a preliminary work, NeRF-Det unifies the tasks of novel view synthesis and 3D perception, demonstrating that perceptual tasks can benefit from novel-view-synthesis methods such as NeRF and significantly improving the performance of indoor multi-view 3D object detection. Its gains come from using the geometry MLP of NeRF to direct the detection head's attention to crucial regions and from incorporating a self-supervised loss based on novel view rendering. To better exploit the advantages of the continuous spatial representation offered by neural rendering, we introduce NeRF-DetS, a novel 3D perception network. Its key component is the Multi-level Sampling-Adaptive Network, which makes the sampling process adaptive in a coarse-to-fine manner. We also propose a superior multi-view information fusion method, Multi-head Weighted Fusion, which avoids the loss of multi-view information incurred by simple arithmetic averaging while keeping computational costs low. NeRF-DetS outperforms the competitive NeRF-Det on the ScanNetV2 dataset, achieving +5.02% and +5.92% improvements in [email protected] and [email protected], respectively.
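The abstract does not spell out the details of Multi-head Weighted Fusion, but the core idea it contrasts with arithmetic averaging can be sketched as follows: each head predicts a scalar weight per view, normalizes the weights across views with a softmax, and takes a weighted sum of its feature slice. This is a minimal NumPy sketch under those assumptions; `multi_head_weighted_fusion` and its random projection vectors are hypothetical stand-ins for learned layers, not the paper's actual implementation.

```python
import numpy as np

def multi_head_weighted_fusion(feats, n_heads=4, rng=None):
    """Fuse per-view features of shape (V, C) into a single (C,) vector.

    Each head handles one C//n_heads slice of the channels: it scores
    every view with a linear projection (random here, learned in practice),
    softmaxes the scores across views, and returns the weighted sum of its
    slice. Head outputs are concatenated back to C channels.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    n_views, channels = feats.shape
    assert channels % n_heads == 0
    d = channels // n_heads
    fused_slices = []
    for h in range(n_heads):
        slice_h = feats[:, h * d:(h + 1) * d]        # (V, d)
        w_proj = rng.standard_normal(d)              # stand-in for a learned layer
        logits = slice_h @ w_proj                    # one score per view, (V,)
        w = np.exp(logits - logits.max())
        w /= w.sum()                                 # softmax over views
        fused_slices.append((w[:, None] * slice_h).sum(axis=0))  # (d,)
    return np.concatenate(fused_slices)              # (C,)

# 8 camera views, 32-channel features; compare with a plain arithmetic mean,
# which weights every view equally regardless of content.
views = np.random.default_rng(1).standard_normal((8, 32))
fused = multi_head_weighted_fusion(views)
mean_fused = views.mean(axis=0)
print(fused.shape, mean_fused.shape)  # (32,) (32,)
```

Unlike the arithmetic mean, the per-view weights let each head down-weight views that contribute little to a given sample location, which is the information loss the fusion method is meant to address.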