Realistic audio synthesis that captures accurate acoustic phenomena is essential for creating immersive experiences in virtual and augmented reality. Synthesizing the sound received at any position relies on estimating the impulse response (IR), which characterizes how sound propagates through a scene along different paths before arriving at the listener's position. In this paper, we present Acoustic Volume Rendering (AVR), a novel approach that adapts volume rendering techniques to model acoustic impulse responses. While volume rendering has been successful in modeling radiance fields for images and neural scene representations, IRs present unique challenges as time-series signals. To address these challenges, we introduce frequency-domain volume rendering and use spherical integration to fit the IR measurements. Our method constructs an impulse response field that inherently encodes wave propagation principles and achieves state-of-the-art performance in synthesizing impulse responses for novel poses. Experiments show that AVR surpasses current leading methods by a substantial margin. Additionally, we develop an acoustic simulation platform, AcoustiX, which provides more accurate and realistic IR simulations than existing simulators. Code for AVR and AcoustiX is available at https://zitonglan.github.io/avr.