Investigating how people perceive virtual reality (VR) videos in the wild (\ie, those captured by everyday users) is a crucial and challenging task in VR-related applications due to complex \textit{authentic} distortions localized in space and time. Existing panoramic video databases only consider synthetic distortions, assume fixed viewing conditions, and are limited in size. To overcome these shortcomings, we construct the VR Video Quality in the Wild (VRVQW) database, one of the first of its kind, which contains $502$ user-generated videos with diverse content and distortion characteristics. Based on VRVQW, we conduct a formal psychophysical experiment to record the scanpaths and perceived quality scores of $139$ participants under two different viewing conditions. We provide a thorough statistical analysis of the recorded data, observing a significant impact of viewing conditions on both human scanpaths and perceived quality. Moreover, we develop an objective quality assessment model for VR videos based on pseudocylindrical representation and convolution. Results on the proposed VRVQW show that our method outperforms existing video quality assessment models, underperforming only viewport-based models, which rely on human scanpaths for projection. Finally, we explore the additional use of the VRVQW dataset to benchmark saliency detection techniques, highlighting the need for further research. We have made the database and code available at \url{https://github.com/limuhit/VR-Video-Quality-in-the-Wild}.