LiDAR upsampling is a challenging task for the perception systems of robots and autonomous vehicles due to the sparse and irregular structure of large-scale scene contexts. Recent works address this problem by projecting LiDAR data from 3D Euclidean space into 2D image space and casting the task as image super-resolution. Although these methods can generate high-resolution range images with fine-grained details, the resulting 3D point clouds often blur out details and contain invalid points. In this paper, we propose TULIP, a new method to reconstruct high-resolution LiDAR point clouds from low-resolution LiDAR input. We also follow a range-image-based approach, but specifically modify the patch and window geometries of a Swin-Transformer-based network to better fit the characteristics of range images. We conducted several experiments on three public real-world and simulated datasets. TULIP outperforms state-of-the-art methods in all relevant metrics and generates more robust and realistic point clouds than prior works.