High-quality human reconstruction and photo-realistic rendering of dynamic scenes are long-standing problems in computer vision and graphics. Despite considerable effort invested in capture systems and reconstruction algorithms, recent methods still struggle with loose or oversized clothing and overly complex poses, in part because high-quality human datasets are difficult to acquire. To facilitate progress in these fields, we present PKU-DyMVHumans, a versatile human-centric dataset for high-fidelity reconstruction and rendering of dynamic human scenarios from dense multi-view videos. It comprises 8.2 million frames captured by more than 56 synchronized cameras across diverse scenarios. The sequences cover 32 human subjects across 45 different scenarios, each with highly detailed appearance and realistic human motion. Inspired by recent advances in neural radiance field (NeRF)-based scene representations, we carefully set up an off-the-shelf framework that makes it easy to run state-of-the-art NeRF-based implementations and benchmark them on the PKU-DyMVHumans dataset. This paves the way for applications such as fine-grained foreground/background decomposition, high-quality human reconstruction, and photo-realistic novel view synthesis of dynamic scenes. Extensive studies on the benchmark reveal new observations and challenges that emerge from using such high-fidelity dynamic data.