Our work introduces the YCB-Ev dataset, which contains synchronized RGB-D frames and event data, enabling the evaluation of 6DoF object pose estimation algorithms using these modalities. This dataset provides ground-truth 6DoF object poses for the same 21 YCB objects used in the YCB-Video (YCB-V) dataset, allowing for cross-dataset evaluation of algorithm performance. The dataset consists of 21 synchronized event and RGB-D sequences, totalling 13,851 frames (7 minutes and 43 seconds of event data). Notably, 12 of these sequences feature the same object arrangements as the YCB-V subset used in the BOP challenge. Ground-truth poses are generated by detecting objects in the RGB-D frames, interpolating the poses to align with the event timestamps, and then transferring them to the event coordinate frame using extrinsic calibration. Our dataset is the first to provide ground-truth 6DoF pose data for event streams. Furthermore, we evaluate the generalization capabilities of two state-of-the-art algorithms, pre-trained for the BOP challenge, on our novel YCB-V sequences. The dataset is publicly available at https://github.com/paroj/ycbev.
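The interpolation-and-transfer step described above could be sketched as follows. This is a minimal illustration, not the paper's exact pipeline: the function names are hypothetical, and the choice of SLERP for rotations with linear interpolation for translations, as well as a single rigid 4x4 extrinsic transform, are our assumptions.

```python
import numpy as np
from scipy.spatial.transform import Rotation, Slerp


def interpolate_poses(t_rgbd, rotations, translations, t_event):
    """Resample object poses from RGB-D frame timestamps to event timestamps.

    t_rgbd:       (N,) timestamps of the RGB-D detections
    rotations:    scipy Rotation holding N rotations
    translations: (N, 3) object positions
    t_event:      (M,) target event timestamps (within [t_rgbd[0], t_rgbd[-1]])
    """
    # SLERP between neighbouring rotations (assumption: rotations vary smoothly)
    slerp = Slerp(t_rgbd, rotations)
    rot_event = slerp(t_event)
    # Component-wise linear interpolation for translations
    trans_event = np.stack(
        [np.interp(t_event, t_rgbd, translations[:, i]) for i in range(3)], axis=1
    )
    return rot_event, trans_event


def to_event_frame(rotation, translation, T_ext):
    """Transfer one pose into the event-camera frame via the 4x4 extrinsic T_ext."""
    T_obj = np.eye(4)
    T_obj[:3, :3] = rotation.as_matrix()
    T_obj[:3, 3] = translation
    return T_ext @ T_obj


# Usage: two detections 1 s apart, resampled at the midpoint
t_rgbd = np.array([0.0, 1.0])
rots = Rotation.from_euler("z", [0.0, 90.0], degrees=True)
trans = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
rot_mid, trans_mid = interpolate_poses(t_rgbd, rots, trans, np.array([0.5]))
pose_event = to_event_frame(rot_mid[0], trans_mid[0], np.eye(4))
```

With an identity extrinsic, the midpoint pose is simply the halfway rotation (45 degrees about z) and translation ([0.5, 0, 0]); in practice `T_ext` comes from the RGB-D-to-event camera calibration.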