User engagement is greatly enhanced by fully immersive multi-modal experiences that combine visual and auditory stimuli. Consequently, the next frontier in VR/AR technologies lies in immersive volumetric videos with complete scene capture, a large 6-DoF interaction space, multi-modal feedback, and high-resolution, high-frame-rate content. To stimulate research on the reconstruction of immersive volumetric videos, we introduce ImViD, a multi-view, multi-modal dataset featuring complete space-oriented data capture across diverse indoor and outdoor scenarios. Our capture rig supports multi-view video-audio capture while on the move, a capability absent from existing datasets, significantly enhancing the completeness, flexibility, and efficiency of data capture. The captured multi-view videos (with synchronized audio) are recorded at 5K resolution and 60 FPS, last 1-5 minutes each, and feature rich foreground-background elements and complex dynamics. We benchmark existing methods on our dataset and establish a baseline pipeline for constructing immersive volumetric videos from multi-view audiovisual inputs, enabling 6-DoF multi-modal immersive VR experiences. The benchmark results, together with the reconstruction and interaction results, demonstrate the effectiveness of our dataset and baseline method, which we believe will stimulate future research on immersive volumetric video production.