Most 3D LIDAR sensors used in intelligent vehicles have substantially lower frame rates than the cameras installed in the same vehicle. This work proposes using a monocular camera to virtually increase the LIDAR frame rate, so that fast-moving dynamic objects in the surroundings can be monitored more frequently. First, dynamic object candidates are identified and tracked in the camera frames. The LIDAR measurement points belonging to these objects are then found by clustering within the frustums of their 2D bounding boxes. Projecting these points onto the camera image and tracking them to the next camera frame yields 3D-2D correspondences between different timesteps. The correspondences between the last LIDAR frame and the current camera frame are used to solve the PnP (Perspective-n-Point) problem. Finally, the estimated transformations are applied to the previously measured points to generate virtual measurements. With the proposed estimation, if the ego motion is known, the positions of both static and dynamic objects can be determined at every timestep for which a camera measurement is available. We achieve state-of-the-art performance on large public datasets in terms of accuracy and similarity to real measurements.
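The frustum selection and virtual-measurement steps can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the LIDAR points are already expressed in the camera frame (extrinsic calibration applied), uses a plain pinhole model with made-up intrinsics, skips the clustering step, and takes the object transform as given. In practice that transform would come from a PnP solver such as OpenCV's `cv2.solvePnP`. All function names, the bounding box, and the numeric values are hypothetical.

```python
import math


def project(point, fx, fy, cx, cy):
    """Pinhole projection of a 3D point (camera frame) to pixel coordinates."""
    x, y, z = point
    return (fx * x / z + cx, fy * y / z + cy)


def points_in_frustum(points, box, fx, fy, cx, cy):
    """Keep points in front of the camera whose projection falls inside a 2D box.

    box is (u_min, v_min, u_max, v_max) in pixels.
    """
    u_min, v_min, u_max, v_max = box
    kept = []
    for p in points:
        if p[2] <= 0.0:  # behind the image plane, cannot be in the frustum
            continue
        u, v = project(p, fx, fy, cx, cy)
        if u_min <= u <= u_max and v_min <= v <= v_max:
            kept.append(p)
    return kept


def apply_rigid_transform(points, rotation, translation):
    """Apply p' = R p + t to every point; rotation is a 3x3 nested list."""
    return [
        tuple(
            sum(rotation[i][j] * p[j] for j in range(3)) + translation[i]
            for i in range(3)
        )
        for p in points
    ]


def yaw_rotation(angle_rad):
    """Rotation about the camera y axis (assumed to be the vertical axis)."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]


# Hypothetical intrinsics and a toy LIDAR sweep (metres, camera frame).
FX, FY, CX, CY = 700.0, 700.0, 640.0, 360.0
lidar_points = [
    (0.0, 0.0, 10.0),   # projects to the image centre
    (0.5, 0.2, 10.0),   # projects close to the centre
    (5.0, 0.0, 10.0),   # projects far to the right, outside the box
    (0.0, 0.0, -5.0),   # behind the camera
]
bounding_box = (600.0, 300.0, 700.0, 400.0)  # hypothetical 2D detection

object_points = points_in_frustum(lidar_points, bounding_box, FX, FY, CX, CY)

# Assumed PnP result: the object moved 1 m towards the camera, no rotation.
R = yaw_rotation(0.0)
t = (0.0, 0.0, -1.0)
virtual_points = apply_rigid_transform(object_points, R, t)
print(virtual_points)
```

Here `virtual_points` plays the role of the virtual LIDAR measurement at the camera timestep: the points from the last real sweep, moved by the per-object transform estimated from the 3D-2D correspondences.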