Realistic sound simulation plays a critical role in many applications. A key element of sound simulation is the room impulse response (RIR), which characterizes how sound propagates within a given space. Recent studies have applied neural implicit methods to learn RIRs from context information collected from the environment, such as scene images. However, these approaches do not effectively leverage explicit geometric information about the environment. To supply neural implicit models with direct geometric features, we present MiNAF, which queries a rough room mesh at given locations and extracts distance distributions as an explicit representation of local context. Our results show that incorporating explicit local geometric features guides the model toward more accurate RIR predictions. In comparisons with conventional and state-of-the-art methods, MiNAF performs competitively across various evaluation metrics.
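To make the idea of "querying a rough room mesh and extracting distance distributions" concrete, the following is a minimal sketch and not the MiNAF implementation. It assumes that a query means casting rays from a location in roughly uniform directions, and that the distance distribution is a normalized histogram of the nearest-hit distances. The ray-casting scheme, the Fibonacci direction sampling, the bin settings, and all function names (`fibonacci_sphere`, `ray_mesh_distances`, `box_mesh`, `distance_distribution`) are assumptions introduced here for illustration only.

```python
import numpy as np

def fibonacci_sphere(n):
    """Return n roughly uniform unit direction vectors, shape (n, 3)."""
    i = np.arange(n) + 0.5
    z = 1.0 - 2.0 * i / n
    r = np.sqrt(1.0 - z * z)
    phi = i * np.pi * (3.0 - np.sqrt(5.0))  # golden-angle increment
    return np.stack([r * np.cos(phi), r * np.sin(phi), z], axis=1)

def ray_mesh_distances(origin, dirs, tris, eps=1e-9):
    """Nearest hit distance per ray (Moller-Trumbore); inf if a ray misses.

    origin: (3,), dirs: (R, 3) unit vectors, tris: (T, 3, 3) triangle vertices.
    """
    v0 = tris[:, 0]
    e1 = tris[:, 1] - v0
    e2 = tris[:, 2] - v0
    p = np.cross(dirs[:, None, :], e2[None, :, :])      # (R, T, 3)
    det = np.einsum("rtk,tk->rt", p, e1)                # (R, T)
    ok = np.abs(det) > eps                              # skip parallel rays
    inv = np.where(ok, 1.0 / np.where(ok, det, 1.0), 0.0)
    s = origin[None, :] - v0                            # (T, 3)
    u = np.einsum("rtk,tk->rt", p, s) * inv
    q = np.cross(s, e1)                                 # (T, 3)
    v = np.einsum("rk,tk->rt", dirs, q) * inv
    t = np.einsum("tk,tk->t", e2, q)[None, :] * inv
    hit = ok & (u >= 0) & (v >= 0) & (u + v <= 1) & (t > eps)
    return np.where(hit, t, np.inf).min(axis=1)

def box_mesh(w, d, h):
    """A hypothetical 'rough room mesh': an axis-aligned box as 12 triangles."""
    c = np.array([[x, y, z] for x in (0, w) for y in (0, d) for z in (0, h)],
                 dtype=float)
    quads = [(0, 1, 3, 2), (4, 5, 7, 6), (0, 1, 5, 4),
             (2, 3, 7, 6), (0, 2, 6, 4), (1, 3, 7, 5)]
    tris = [(a, b, cc) for a, b, cc, _ in quads]
    tris += [(a, cc, dd) for a, _, cc, dd in quads]
    return c[np.array(tris)]                            # (12, 3, 3)

def distance_distribution(origin, tris, n_rays=256, n_bins=16, max_dist=10.0):
    """Normalized histogram of nearest-surface distances around `origin`."""
    dist = ray_mesh_distances(np.asarray(origin, dtype=float),
                              fibonacci_sphere(n_rays), tris)
    dist = np.clip(dist[np.isfinite(dist)], 0.0, max_dist)
    hist, _ = np.histogram(dist, bins=n_bins, range=(0.0, max_dist))
    return hist / max(hist.sum(), 1)

room = box_mesh(4.0, 3.0, 2.5)                          # illustrative room size
feature = distance_distribution([1.0, 1.0, 1.0], room)  # (16,) local context
```

Under these assumptions, a vector such as `feature` would be computed at the relevant locations and passed to the neural implicit model alongside its other inputs. How MiNAF actually samples the mesh and encodes the distances is not specified in the abstract.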