With the advancement of computing resources, an increasing number of Neural Networks (NNs) for image detection and segmentation are appearing. However, these methods usually accept an RGB 2D image as input. On the other hand, Light Detection And Ranging (LiDAR) sensors with many layers produce images similar to those obtained from a traditional low-resolution RGB camera. Following this principle, a new dataset for segmenting cars in pseudo-RGB images has been generated. This dataset combines the information provided by the LiDAR sensor into a Spherical Range Image (SRI), specifically the reflectivity, near-infrared, and signal-intensity 2D images. These images are then fed into instance segmentation NNs, which segment the cars appearing in them, achieving a Bounding Box (BB) precision of 88% and a mask precision of 81.5% with You Only Look Once (YOLO)-v8 large. On top of this segmentation NN, several trackers have been applied to follow each segmented car instance along a video feed, showing strong performance in real-world experiments.
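The core idea of building a pseudo-RGB image from a multi-layer LiDAR can be sketched as a spherical projection that stacks three intensity channels into the R, G, and B planes. The following is a minimal illustrative sketch, not the paper's actual pipeline: the function name, parameter names, the 64x1024 resolution, and the vertical field of view are all assumptions for demonstration.

```python
import numpy as np

def pseudo_rgb_sri(points, reflectivity, near_ir, signal,
                   h=64, w=1024, fov_up=15.0, fov_down=-15.0):
    """Project LiDAR returns onto a Spherical Range Image (SRI) grid and
    stack three per-point channels as a pseudo-RGB image.

    Assumptions (illustrative only): points is (N, 3) xyz in the sensor
    frame; the three channel arrays are (N,); h, w and the vertical FOV
    are example values, not the paper's configuration.
    """
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    r = np.linalg.norm(points, axis=1)
    yaw = np.arctan2(y, x)                                   # azimuth in [-pi, pi]
    pitch = np.arcsin(np.clip(z / np.maximum(r, 1e-9), -1.0, 1.0))

    fov_up_r, fov_down_r = np.radians(fov_up), np.radians(fov_down)
    # Map azimuth to image columns and elevation to image rows.
    u = ((1.0 - (yaw + np.pi) / (2.0 * np.pi)) * w).astype(int) % w
    v = np.clip((fov_up_r - pitch) / (fov_up_r - fov_down_r) * h,
                0, h - 1).astype(int)

    # Stack the three LiDAR channels as the R, G, B planes.
    img = np.zeros((h, w, 3), dtype=np.float32)
    img[v, u, 0] = reflectivity
    img[v, u, 1] = near_ir
    img[v, u, 2] = signal
    return img
```

The resulting `(h, w, 3)` array has the same layout as a low-resolution camera frame, which is what allows off-the-shelf instance segmentation networks such as YOLO-v8 to be applied without architectural changes.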