Enabled by large annotated datasets, tracking and segmentation of objects in videos have made remarkable progress in recent years. Despite these advancements, algorithms still struggle under degraded conditions and during fast movements. Event cameras are novel sensors with high temporal resolution and high dynamic range that offer promising advantages to address these challenges. However, annotated data for developing learning-based mask-level tracking algorithms with events is not available. To this end, we introduce: ($i$) a new task termed \emph{space-time instance segmentation}, similar to video instance segmentation, whose goal is to segment instances throughout the entire duration of the sensor input (here, the input consists of quasi-continuous events and, optionally, aligned frames); and ($ii$) \emph{\dname}, a dataset for the new task, containing aligned grayscale frames and events. It includes annotated ground-truth labels (pixel-level instance segmentation masks) of a group of up to seven freely moving and interacting mice. We also provide two reference methods, which show that leveraging event data can consistently improve tracking performance, especially when used in combination with conventional cameras. The results highlight the potential of event-aided tracking in difficult scenarios. We hope our dataset opens the field of event-based video instance segmentation and enables the development of robust tracking algorithms for challenging conditions. \url{https://github.com/tub-rip/MouseSIS}