Robust road surface estimation is required for autonomous ground vehicles to navigate safely. Although it has become one of the main targets of autonomous mobility research in recent years, it remains an open problem. Cameras and LiDAR sensors have proven adequate for predicting the position, size and shape of the road a vehicle is driving on in different environments. In this work, a novel Convolutional Neural Network model is proposed for accurate estimation of the roadway surface. Furthermore, an ablation study has been conducted to investigate how different encoding strategies affect model performance, testing six slightly different neural network architectures. Our model is based on a Twin Encoder-Decoder Neural Network (TEDNet) that extracts camera and LiDAR features independently, and it has been trained and evaluated on the Kitti-Road dataset. Bird's Eye View projections of the camera and LiDAR data are used to perform semantic segmentation, classifying whether each pixel belongs to the road surface. The proposed method performs on par with other state-of-the-art methods and operates at the same frame rate as the LiDAR and cameras, making it suitable for real-time applications.
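To make the Bird's Eye View input more concrete, the following is a minimal sketch of how LiDAR points could be rasterized into a top-down grid before being passed to a segmentation network. It is not the authors' preprocessing: the function name `lidar_to_bev`, the coordinate convention (x forward, y left, z up), the default ranges, the cell size and the choice of maximum height as the cell value are all assumptions made for illustration. The camera image would additionally need a calibrated perspective transform, which is not shown here.

```python
from typing import List, Optional, Sequence, Tuple

# Assumed point layout: (x forward, y left, z up), all in metres.
Point = Tuple[float, float, float]


def lidar_to_bev(
    points: Sequence[Point],
    x_range: Tuple[float, float] = (6.0, 46.0),    # assumed forward extent
    y_range: Tuple[float, float] = (-10.0, 10.0),  # assumed lateral extent
    cell_size: float = 0.1,                        # assumed metres per cell
) -> List[List[Optional[float]]]:
    """Rasterize LiDAR points into a top-down grid of maximum heights.

    Row 0 is the farthest slice ahead of the vehicle and column 0 is the
    leftmost slice, so the grid reads like an image seen from above.
    Cells that receive no point are left as None.
    """
    x_min, x_max = x_range
    y_min, y_max = y_range
    rows = int(round((x_max - x_min) / cell_size))
    cols = int(round((y_max - y_min) / cell_size))
    grid: List[List[Optional[float]]] = [[None] * cols for _ in range(rows)]

    for x, y, z in points:
        # Discard points that fall outside the region of interest.
        if not (x_min <= x < x_max and y_min <= y < y_max):
            continue
        # Clamp to guard against floating-point rounding at the upper edge.
        x_idx = min(rows - 1, int((x - x_min) / cell_size))
        y_idx = min(cols - 1, int((y - y_min) / cell_size))
        row = rows - 1 - x_idx  # farther points map to smaller row indices
        col = cols - 1 - y_idx  # points further left map to smaller columns
        current = grid[row][col]
        if current is None or z > current:
            grid[row][col] = z  # keep the highest point seen in this cell

    return grid
```

Each cell of the resulting grid corresponds to a fixed patch of ground, so a per-pixel road/not-road prediction on this grid translates directly into metric road geometry around the vehicle.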