Room geometry is important prior information for implementing realistic 3D audio rendering. For this reason, various room geometry inference (RGI) methods have been developed that utilize the time-of-arrival (TOA) or time-difference-of-arrival (TDOA) information in room impulse responses (RIRs). However, conventional RGI techniques rely on several assumptions, such as convex room shapes, a number of walls known a priori, and the visibility of first-order reflections. In this work, we introduce RGI-Net, which can estimate room geometries without the aforementioned assumptions. RGI-Net learns and exploits complex relationships between low-order and high-order reflections in RIRs and can thus estimate room shapes even when the shape is non-convex or first-order reflections are missing from the RIRs. RGI-Net includes an evaluation network that separately evaluates the presence probability of walls, so geometry inference is possible without prior knowledge of the number of walls.