We introduce CzechLynx, the first large-scale, open-access dataset for individual identification, pose estimation, and instance segmentation of the Eurasian lynx (Lynx lynx). CzechLynx contains 39,760 camera trap images annotated with segmentation masks, identity labels, and 20-point skeletons. It covers 319 unique individuals recorded over 15 years of systematic monitoring in two geographically distinct regions: southwest Bohemia and the Western Carpathians. In addition to the real camera trap data, we provide a large complementary set of photorealistic synthetic images and a Unity-based generation pipeline with diffusion-based text-to-texture modeling, which can produce arbitrarily large amounts of synthetic data spanning diverse environments, poses, and coat-pattern variations. To enable systematic testing, we define three complementary evaluation protocols: (i) geo-aware, (ii) time-aware open-set, and (iii) time-aware closed-set, which together cover cross-regional and long-term monitoring settings. With these resources, CzechLynx offers a unique, flexible benchmark for robust evaluation of computer vision and machine learning models across realistic ecological scenarios.
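To make the three evaluation protocols more concrete, the following is a minimal sketch of how such splits could be constructed from per-image metadata. It is an illustration under stated assumptions, not the dataset's official splitting code: the `Sighting` record, its field names, the region labels, the cutoff year, and the reading of "open-set" as "test identities may be unseen during training" are all hypothetical and may differ from the definitions used by CzechLynx.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sighting:
    """Hypothetical per-image metadata record (field names are assumptions)."""
    image_id: str
    identity: str
    region: str
    year: int


def geo_aware_split(sightings, train_region):
    """Train on one region and test on all others (assumed reading of 'geo-aware')."""
    train = [s for s in sightings if s.region == train_region]
    test = [s for s in sightings if s.region != train_region]
    return train, test


def time_aware_split(sightings, cutoff_year, open_set):
    """Train on images before cutoff_year and test on images from it onward.

    Closed-set: keep only test images whose identity also occurs in training.
    Open-set: keep all later images, including previously unseen identities.
    """
    train = [s for s in sightings if s.year < cutoff_year]
    later = [s for s in sightings if s.year >= cutoff_year]
    known = {s.identity for s in train}
    test = later if open_set else [s for s in later if s.identity in known]
    return train, test, known


# Toy records for illustration only; they are not taken from the dataset.
toy = [
    Sighting("a", "lynx_01", "bohemia", 2012),
    Sighting("b", "lynx_01", "bohemia", 2019),
    Sighting("c", "lynx_02", "carpathians", 2013),
    Sighting("d", "lynx_03", "carpathians", 2020),
]

geo_train, geo_test = geo_aware_split(toy, train_region="bohemia")
_, closed_test, known_ids = time_aware_split(toy, cutoff_year=2016, open_set=False)
_, open_test, _ = time_aware_split(toy, cutoff_year=2016, open_set=True)
```

In this toy setup the closed-set test split keeps only individuals already present in the training period, whereas the open-set split also retains newly appearing individuals, which a model would need to flag as unknown rather than force into an existing identity.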