Animal pose estimation has become an important area of research, but the scarcity of annotated data remains a significant obstacle to developing accurate models. Synthetic data is a promising alternative, yet it frequently exhibits a domain gap with respect to real data. Style transfer algorithms have been proposed to close this gap, but they suffer from insufficient spatial correspondence, which leads to the loss of label information. In this work, we present Synthetic Pose-aware Animal ControlNet (SPAC-Net), a new approach that incorporates ControlNet into the previously proposed Prior-Aware Synthetic animal data generation (PASyn) pipeline. We use the plausible pose data generated by the Variational Auto-Encoder (VAE)-based data generation pipeline as input to the ControlNet Holistically-nested Edge Detection (HED) boundary task model, generating synthetic data with pose labels that are closer to real data and making it possible to train a high-precision pose estimation network without real data. In addition, we propose the Bi-ControlNet structure, which detects the HED boundaries of the animal and the background separately, improving the precision and stability of the generated data. Using the SPAC-Net pipeline, we generate synthetic zebra and rhino images and evaluate models trained on them on the real AP10K dataset, demonstrating superior performance compared with using only real images or synthetic data generated by other methods. Our work demonstrates the potential of synthetic data to overcome the challenge of limited annotated data in animal pose estimation.