The room impulse response (RIR), which characterizes sound propagation within an environment, is critical for synthesizing high-fidelity audio for that environment. Prior work has proposed representing the RIR as a neural field that is a function of the sound emitter and receiver positions. However, these methods do not sufficiently account for the acoustic properties of the audio scene, which leads to unsatisfactory performance. This letter proposes a novel Neural Acoustic Context Field approach, called NACF, that parameterizes an audio scene by leveraging multiple acoustic contexts, such as geometry, material properties, and spatial information. Motivated by two distinctive properties of the RIR, namely temporal un-smoothness and monotonic energy attenuation, we design a temporal correlation module and a multi-scale energy decay criterion. Experimental results show that NACF outperforms existing field-based methods by a notable margin. Please visit our project page for more qualitative results.
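As a minimal illustration of the monotonic energy attenuation property mentioned in the abstract, the sketch below computes an energy decay curve by Schroeder backward integration on a toy RIR. It is not the multi-scale energy decay criterion of NACF, whose exact form is not given here. The helper names `energy_decay_curve` and `synthetic_rir`, the noise-under-exponential-envelope model, and all parameter values are assumptions chosen for illustration only.

```python
import math
import random


def energy_decay_curve(rir):
    """Schroeder backward integration: edc[t] = sum of rir[k]**2 for k >= t."""
    edc = [0.0] * len(rir)
    running = 0.0
    # Accumulate squared samples from the tail toward the start.
    for t in range(len(rir) - 1, -1, -1):
        running += rir[t] ** 2
        edc[t] = running
    return edc


def synthetic_rir(n_samples=4000, decay=0.002, seed=0):
    """Toy RIR (assumption): Gaussian noise under an exponential envelope."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) * math.exp(-decay * t) for t in range(n_samples)]


rir = synthetic_rir()
edc = energy_decay_curve(rir)

# Normalize to the total energy and convert to decibels.
edc_db = [10.0 * math.log10(e / edc[0]) for e in edc if e > 0.0]

# The waveform itself fluctuates sample to sample, but its remaining
# energy can only shrink as time advances.
is_monotone = all(edc[i] >= edc[i + 1] for i in range(len(edc) - 1))
print("monotonically non-increasing:", is_monotone)
```

Because each step of the backward sum adds a non-negative squared sample, the curve is non-increasing by construction, even though the underlying waveform is not smooth over time. This contrast between a jagged waveform and a monotone energy envelope is what the two properties named in the abstract describe.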