We address the challenge of simulating sound propagation in 3D virtual rooms with moving sources, which has applications in virtual/augmented reality, game audio, and spatial computing. Solutions to the wave equation capture wave phenomena such as diffraction and interference. However, computing them with conventional numerical discretization methods for hundreds of source and receiver positions is intractable, which makes simulating a sound field with moving sources impractical. To overcome this limitation, we propose using deep operator networks to approximate linear wave-equation operators. This enables rapid prediction of sound propagation in realistic 3D acoustic scenes with moving sources, with millisecond-scale computation times. By learning a compact surrogate model, we avoid the offline calculation and storage of impulse responses for all relevant source/listener pairs. Our experiments, which include various complex scene geometries, show good agreement with reference solutions, with root mean squared errors ranging from 0.02 Pa to 0.10 Pa. Notably, our method signifies a paradigm shift, as no prior machine learning approach has achieved precise predictions of complete wave fields within realistic domains. We anticipate that our findings will drive further exploration of deep neural operator methods, advancing research in immersive user experiences within virtual environments.
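As a rough illustration of the deep operator network idea mentioned above, the following minimal sketch evaluates a branch/trunk network pair and a root mean squared error. It is not the architecture used in this work: the choice of the source position as branch input, the layer sizes, the `tanh` activation, the random untrained weights, and the names `init_mlp`, `mlp`, `deeponet`, and `rmse` are all assumptions made for the example.

```python
import numpy as np

def init_mlp(rng, sizes):
    """Random (untrained) weights for a small fully connected network."""
    return [(rng.standard_normal((m, n)) / np.sqrt(m), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def mlp(params, x):
    """Apply the network: tanh on hidden layers, linear output layer."""
    for w, b in params[:-1]:
        x = np.tanh(x @ w + b)
    w, b = params[-1]
    return x @ w + b

def deeponet(branch, trunk, source_pos, queries):
    """Pressure estimate at each query point for one source position.

    source_pos: (3,) source coordinates fed to the branch net (assumed input).
    queries:    (n, 4) receiver coordinates and time (x, y, z, t) for the trunk net.
    """
    b = mlp(branch, source_pos[None, :])  # (1, p) coefficients
    t = mlp(trunk, queries)               # (n, p) basis values
    return (t * b).sum(axis=1)            # (n,) inner product over p

def rmse(pred, ref):
    """Root mean squared error between prediction and reference."""
    return float(np.sqrt(np.mean((pred - ref) ** 2)))

rng = np.random.default_rng(0)
p = 16                                    # latent width, illustrative only
branch = init_mlp(rng, [3, 32, p])
trunk = init_mlp(rng, [4, 32, p])
queries = rng.uniform(0.0, 1.0, size=(5, 4))
pressure = deeponet(branch, trunk, np.array([0.5, 0.5, 0.5]), queries)
print(pressure.shape)
```

Once such a network is trained, moving the source only changes the branch input, so a new sound field requires a forward pass rather than a new wave-equation solve.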