Many fields, including ecology, biology, and neuroscience, use recordings of animals to track and measure their behaviour. A large volume of such data has accumulated over time, but computer vision techniques that depend on annotations cannot exploit it. To address this, we propose an approach for estimating 2D mouse body pose from unlabelled images using a synthetically generated empirical pose prior. Our proposal builds on a recent self-supervised method for 2D human pose estimation that uses single images and a set of unpaired typical 2D poses within a GAN framework. We adapt this method to the limb structure of the mouse and generate the empirical prior of 2D poses from a synthetic 3D mouse model, thereby avoiding manual annotation. In experiments on a new mouse video dataset, we evaluate the approach by comparing its pose predictions to a manually obtained ground truth. We also compare its predictions with those of a supervised state-of-the-art method for animal pose estimation. This comparison indicates promising results despite the lack of paired training data. Finally, qualitative results on a dataset of horse images show the potential of the approach to adapt to other animal species.
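As a rough illustration of how an empirical 2D pose prior might be generated from a synthetic 3D model, the following minimal sketch rotates a toy 3D skeleton, projects it to 2D, and normalises the result. Everything specific here is an assumption made for illustration and is not taken from the paper: the keypoint set `TOY_SKELETON`, the helper names `rotation_z` and `sample_pose_prior`, the top-down orthographic camera, the Gaussian jitter standing in for pose variation, and the normalisation scheme.

```python
import numpy as np

# Hypothetical toy skeleton: eight 3D keypoints (x, y, z) in arbitrary units.
# These coordinates are illustrative only and do not come from the paper.
TOY_SKELETON = np.array([
    [1.0, 0.0, 0.2],    # nose
    [0.5, 0.0, 0.3],    # neck
    [-0.5, 0.0, 0.3],   # tail base
    [-1.2, 0.0, 0.1],   # tail tip
    [0.4, 0.3, 0.0],    # front-left paw
    [0.4, -0.3, 0.0],   # front-right paw
    [-0.4, 0.3, 0.0],   # hind-left paw
    [-0.4, -0.3, 0.0],  # hind-right paw
])


def rotation_z(angle):
    """Rotation matrix for a rotation of `angle` radians about the z axis."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


def sample_pose_prior(skeleton, n_samples, jitter=0.02, seed=0):
    """Return an array of shape (n_samples, n_keypoints, 2) of 2D poses.

    Assumed procedure: perturb the 3D keypoints, rotate about the vertical
    axis by a random heading, drop z (top-down orthographic projection),
    then centre each pose and scale it to unit maximum radius.
    """
    rng = np.random.default_rng(seed)
    poses = np.empty((n_samples, skeleton.shape[0], 2))
    for i in range(n_samples):
        heading = rng.uniform(0.0, 2.0 * np.pi)
        noisy = skeleton + rng.normal(0.0, jitter, size=skeleton.shape)
        rotated = noisy @ rotation_z(heading).T
        pose2d = rotated[:, :2]                  # orthographic projection
        pose2d = pose2d - pose2d.mean(axis=0)    # remove translation
        scale = np.linalg.norm(pose2d, axis=1).max()
        poses[i] = pose2d / scale                # remove scale
    return poses


if __name__ == "__main__":
    prior = sample_pose_prior(TOY_SKELETON, n_samples=1000)
    print(prior.shape)
```

In a GAN setting such as the one described above, unpaired 2D poses of this kind would plausibly serve as the "real" samples shown to the discriminator, while the poses predicted from unlabelled images serve as the generated ones. A realistic prior would presumably come from an articulated, animated 3D model rather than from jitter applied to a single rigid skeleton.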