We introduce MABe22, a large-scale, multi-agent video and trajectory benchmark for assessing the quality of learned behavior representations. The dataset is collected from a variety of biology experiments and includes triplets of interacting mice (4.7 million frames of video and pose-tracking data, plus 10 million frames of pose only), symbiotic beetle-ant interactions (10 million frames of video data), and groups of interacting flies (4.4 million frames of pose-tracking data). Alongside these data, we introduce a panel of real-life downstream analysis tasks that assess the quality of learned representations by evaluating how well they preserve information about experimental conditions (e.g., strain, time of day, optogenetic stimulation) and animal behavior. We test multiple state-of-the-art self-supervised video and trajectory representation learning methods to demonstrate the use of our benchmark, and find that methods developed on human action datasets do not fully translate to animal datasets. We hope that our benchmark and dataset encourage broader exploration of behavior representation learning methods across species and settings.