For driverless train operation on mainline railways, several tasks need to be implemented by technical systems. One of the most challenging, owing to long braking distances, is monitoring the train's path and its surroundings for potential obstacles. Machine learning algorithms can be used to analyze data from vision sensors such as infrared (IR) and visual (RGB) cameras, lidars, and radars to detect objects. Such algorithms require large amounts of annotated training data covering both objects in the rail environment that may become obstacles and rail-specific objects such as tracks or catenary poles. However, very few datasets are publicly available, and those that are typically cover only a limited number of sensors. Datasets and trained models from other domains, such as automotive, are useful but insufficient for object detection in the railway context. Therefore, this publication presents OSDaR23, a multi-sensor dataset of 21 sequences captured in Hamburg, Germany, in September 2021. The sensor setup consisted of multiple calibrated and synchronized IR/RGB cameras, lidars, a radar, and position and acceleration sensors front-mounted on a railway vehicle. In addition to raw data, the dataset contains 204,091 polyline, polygon, rectangle, and cuboid annotations for 20 different object classes. The dataset can also be used for tasks beyond collision prediction, which are listed in this paper.