Video see-through (VST) technology aims to seamlessly blend virtual and physical worlds by reconstructing reality through cameras. While manufacturers promise perceptual fidelity, it remains unclear how closely these systems replicate natural human vision across varying environmental conditions. In this work, we quantify the perceptual gap between the human eye and three popular VST headsets (Apple Vision Pro, Meta Quest 3, Meta Quest Pro) using psychophysical measures of visual acuity, contrast sensitivity, and color vision. We show that despite hardware advances, all tested VST systems fail to match the dynamic range and adaptability of the naked eye. While high-end devices approach human performance under ideal lighting, they degrade significantly in low-light conditions, particularly in contrast sensitivity and acuity. Our results characterize the limitations of current digital reality reconstruction relative to human physiology, establishing a concrete perceptual gap that defines a roadmap toward visually indistinguishable VST experiences.
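For readers unfamiliar with the psychophysical measures named above, the sketch below illustrates how visual acuity and contrast sensitivity are conventionally quantified: acuity as logMAR (the base-10 log of the minimum angle of resolution in arcminutes) and contrast sensitivity as the reciprocal of the Michelson contrast at detection threshold. This is a minimal illustration of the standard definitions, not the paper's actual measurement protocol; the function names and example values are ours.

```python
import math

def logmar_acuity(min_angle_arcmin: float) -> float:
    """logMAR visual acuity from the minimum angle of resolution (arcminutes).
    0.0 logMAR corresponds to 20/20 vision (1 arcmin)."""
    return math.log10(min_angle_arcmin)

def michelson_contrast(l_max: float, l_min: float) -> float:
    """Michelson contrast of a grating from its peak and trough luminance."""
    return (l_max - l_min) / (l_max + l_min)

def contrast_sensitivity(threshold_contrast: float) -> float:
    """Contrast sensitivity is conventionally the reciprocal of the lowest
    (threshold) contrast at which the stimulus is still detectable."""
    return 1.0 / threshold_contrast

# Example: a grating detected down to 1% Michelson contrast gives a sensitivity of 100;
# a smallest resolvable detail of 2 arcmin gives ~0.30 logMAR (about 20/40 vision).
if __name__ == "__main__":
    print(contrast_sensitivity(michelson_contrast(101.0, 99.0)))  # ≈ 100
    print(round(logmar_acuity(2.0), 2))                           # 0.3
```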