LiDAR upsampling is a challenging task for the perception systems of robots and autonomous vehicles, due to the sparse and irregular structure of large-scale scene contexts. Recent works propose to solve this problem by projecting LiDAR data from 3D Euclidean space into 2D range images and treating upsampling as an image super-resolution problem. Although these methods can generate high-resolution range images with fine-grained details, the resulting 3D point clouds often blur out details and contain invalid points. In this paper, we propose TULIP, a new method to reconstruct high-resolution LiDAR point clouds from low-resolution LiDAR input. We also follow a range image-based approach but specifically modify the patch and window geometries of a Swin-Transformer-based network to better fit the characteristics of range images. We conduct experiments on three public real-world and simulated datasets. TULIP outperforms state-of-the-art methods in all relevant metrics and generates more robust and realistic point clouds than prior works.
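To illustrate the range image representation underlying this approach, the sketch below shows the standard spherical projection that maps a 3D point cloud onto a 2D range image. The resolution (`h`, `w`) and vertical field-of-view values are illustrative placeholders, not TULIP's actual configuration.

```python
import numpy as np

def point_cloud_to_range_image(points, h=16, w=1024,
                               fov_up_deg=15.0, fov_down_deg=-15.0):
    """Project an (N, 3) LiDAR point cloud to an (h, w) range image
    via spherical projection (illustrative parameters)."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    r = np.linalg.norm(points, axis=1)          # range per point
    yaw = np.arctan2(y, x)                      # azimuth in [-pi, pi]
    pitch = np.arcsin(z / np.maximum(r, 1e-8))  # elevation angle

    fov_up = np.deg2rad(fov_up_deg)
    fov_down = np.deg2rad(fov_down_deg)
    fov = fov_up - fov_down

    # Normalize angles to pixel coordinates.
    u = 0.5 * (1.0 - yaw / np.pi) * w           # column index
    v = (1.0 - (pitch - fov_down) / fov) * h    # row index
    u = np.clip(np.floor(u), 0, w - 1).astype(np.int32)
    v = np.clip(np.floor(v), 0, h - 1).astype(np.int32)

    image = np.zeros((h, w), dtype=np.float32)  # 0 marks empty pixels
    # When several points fall into one pixel, keep the closest return:
    # sort far-to-near so nearer points overwrite farther ones.
    order = np.argsort(-r)
    image[v[order], u[order]] = r[order]
    return image
```

Upsampling then amounts to predicting a range image with more rows (i.e., more scan lines) from this low-resolution input, after which the high-resolution point cloud is recovered by inverting the projection.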