Depth estimation and 3D reconstruction have been extensively studied as core topics in computer vision. Starting from rigid objects with relatively simple geometry, such as vehicles, research has expanded to general objects, including challenging deformable subjects such as humans and animals. For animals in particular, however, most existing models are trained on datasets that lack metric scale, the very ground truth needed to validate image-only models. To address this limitation, we present WildDepth, a multimodal dataset and benchmark suite for depth estimation, behavior detection, and 3D reconstruction, covering diverse animal categories captured in environments ranging from domestic to wild with synchronized RGB and LiDAR. Experimental results show that multimodal data reduces depth error by up to 10% in RMSE, while RGB-LiDAR fusion improves 3D reconstruction fidelity by 12% in Chamfer distance. By releasing WildDepth and its benchmarks, we aim to foster robust multimodal perception systems that generalize across domains.
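The abstract cites two metrics: RMSE on depth maps and Chamfer distance on reconstructed geometry. Below is a minimal sketch of their standard definitions; the benchmark's actual evaluation protocol (valid-pixel masking, scale alignment, point sampling) is not specified in this section, so the conventions in the code, including the zero-depth masking rule, are assumptions.

```python
# Standard forms of the two metrics quoted in the abstract. The exact
# masking and alignment protocol used by the WildDepth benchmark is not
# given here; this sketch assumes valid-pixel masking only.
import numpy as np

def depth_rmse(pred, gt, valid_mask=None):
    """Root-mean-square error between predicted and ground-truth depth maps."""
    if valid_mask is None:
        # Assumed convention: zero depth marks pixels with no LiDAR return.
        valid_mask = gt > 0
    diff = pred[valid_mask] - gt[valid_mask]
    return float(np.sqrt(np.mean(diff ** 2)))

def chamfer_distance(points_a, points_b):
    """Symmetric (squared) Chamfer distance between (N, 3) and (M, 3) clouds.

    Brute-force O(N*M) version for clarity; benchmarks typically use KD-trees.
    """
    # Pairwise squared distances, shape (N, M).
    d2 = np.sum((points_a[:, None, :] - points_b[None, :, :]) ** 2, axis=-1)
    # Nearest-neighbor term in each direction, then sum the two means.
    return float(d2.min(axis=1).mean() + d2.min(axis=0).mean())

# Toy usage with random data standing in for real predictions.
rng = np.random.default_rng(0)
pred_depth = rng.uniform(1.0, 20.0, size=(240, 320))
gt_depth = pred_depth + rng.normal(0.0, 0.5, size=pred_depth.shape)
print("Depth RMSE:", depth_rmse(pred_depth, gt_depth))

cloud_a = rng.normal(size=(500, 3))
cloud_b = cloud_a + rng.normal(0.0, 0.01, size=cloud_a.shape)
print("Chamfer distance:", chamfer_distance(cloud_a, cloud_b))
```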