We address the challenge of simulating sound propagation in 3D virtual rooms with moving sources, a problem with applications in virtual/augmented reality, game audio, and spatial computing. Solutions to the wave equation capture wave phenomena such as diffraction and interference. However, computing them with conventional numerical discretization methods for hundreds of source and receiver positions is intractable, which makes simulating a sound field with moving sources impractical. To overcome this limitation, we propose using deep operator networks to approximate linear wave-equation operators. This enables rapid prediction of sound propagation in realistic 3D acoustic scenes with moving sources, with computations on the scale of milliseconds. By learning a compact surrogate model, we avoid the offline calculation and storage of impulse responses for all relevant source/listener pairs. Our experiments, which include various complex scene geometries, show good agreement with reference solutions, with root mean squared errors ranging from 0.02 Pa to 0.10 Pa. Notably, our method represents a paradigm shift, as no prior machine learning approach has achieved precise predictions of complete wave fields within realistic domains. We anticipate that our findings will drive further exploration of deep neural operator methods, advancing research on immersive user experiences in virtual environments.