Fisheye cameras are widely deployed in autonomous driving perception suites for their low cost and full-coverage field of view (FOV), yet they remain underexploited in 3D object detection. Their severe radial distortion violates the uniform-sampling assumption on which most bird's-eye-view (BEV) detectors rely. To bridge this gap, we propose Distortion-Aware PETR (DAPETR), a projection-free detector tailored for mixed pinhole-fisheye camera setups. DAPETR incorporates two learned adaptive modules: a unified distortion-aware positional embedding that aligns the positional encodings of image features with fisheye geometry, and a bidirectional feature-geometry co-modulation module in which image features and 3D positional embeddings adapt to each other. On a converted KITTI-360 benchmark, we systematically compare this learned adaptive approach against PETR formulated in polar coordinates (PolarPETR). Both methods improve over the baseline, with our learned modules performing better. Crucially, combining the two strategies produces a negative interaction, which indicates that learned adaptation and explicit geometric reparameterization can conflict. Our final DAPETR model advances both the state of research and the benchmark for fisheye BEV detection, and offers insights into designing distortion-aware 3D perception without relying on image rectification.