In time-critical extended reality (XR) scenarios, users engaged in a primary task must rapidly reorient their attention to hazards, alerts, or instructions. Spatial audio can provide an immediate directional cue in these situations without occupying visual bandwidth. However, such scenarios afford only brief auditory exposure, so users must interpret sound direction quickly, without extended listening or head-driven refinement. This paper reports a controlled exploratory study of rapid spatial-audio localization in XR. Using HRTF-rendered broadband stimuli presented from a semi-dense set of directions around the listener, we quantify how accurately users can infer coarse direction from brief audio alone. We further examine the effect of short-term visuo-auditory feedback training as a lightweight calibration mechanism. Our findings show that brief spatial cues can convey coarse directional information, and that even short calibration can improve users' perception of auditory signals. While these results highlight the potential of spatial audio for rapid attention guidance, they also show that auditory cues alone may not provide sufficient precision for complex or high-stakes tasks, and that, when head-driven refinement is unavailable, spatial audio may be most effective when complemented by visual cues or other sensory modalities. We position this study as a preliminary investigation of spatial audio as a first-stage attention-guidance channel for wearable XR (e.g., VR head-mounted displays and AR smart glasses), and provide design insights on stimulus selection and calibration for time-critical use.