We present an algorithm that fully reverses the shoebox image source method (ISM), a widely used room impulse response (RIR) simulator for cuboid rooms introduced by Allen and Berkley in 1979. More precisely, given a discrete multichannel RIR generated by the shoebox ISM for a microphone array of known geometry, the algorithm reliably recovers the 18 input parameters. These are the 3D source position, the 3 dimensions of the room, the 6-degree-of-freedom room translation and orientation, and an absorption coefficient for each of the 6 room boundaries. The approach builds on a recently proposed gridless image source localization technique combined with new procedures for room axes recovery and first-order-reflection identification. Extensive simulated experiments reveal that near-exact recovery of all parameters is achieved for a 32-element, 8.4-cm-wide spherical microphone array and a sampling rate of 16 kHz using fully randomized input parameters within rooms of size 2 × 2 × 2 to 10 × 10 × 5 meters. Estimation errors decay towards zero when increasing the array size and sampling rate. The method is also shown to strongly outperform a known baseline, and its ability to extrapolate RIRs at new positions is demonstrated. Crucially, the approach is strictly limited to low-passed discrete RIRs simulated using the vanilla shoebox ISM. Nonetheless, it represents, to our knowledge, the first algorithmic demonstration that this difficult inverse problem is in principle fully solvable over a wide range of configurations.
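To make the forward model being inverted more concrete, the following is a minimal Python sketch of the first-order part of a shoebox ISM. It is an illustration under simplifying assumptions, not the implementation used in this work: the room is assumed axis-aligned with one corner at the origin (whereas the abstract allows arbitrary room translation and orientation), only the 6 first-order image sources are computed, and the function names `first_order_images` and `arrival`, as well as the speed of sound `c = 343.0` m/s, are hypothetical choices made for this example.

```python
import numpy as np


def first_order_images(source, room_dims):
    """Return the 6 first-order image sources of `source`.

    Assumption of this sketch: the room is axis-aligned and spans
    [0, Lx] x [0, Ly] x [0, Lz], with room_dims = (Lx, Ly, Lz).
    Rows are ordered as (x=0, x=Lx, y=0, y=Ly, z=0, z=Lz).
    """
    source = np.asarray(source, dtype=float)
    room_dims = np.asarray(room_dims, dtype=float)
    images = []
    for axis in range(3):
        for wall in (0.0, room_dims[axis]):
            img = source.copy()
            # Mirror the source across the plane {coordinate[axis] == wall}.
            img[axis] = 2.0 * wall - source[axis]
            images.append(img)
    return np.array(images)


def arrival(image, mic, absorption, c=343.0):
    """Delay (seconds) and amplitude of one first-order echo at `mic`.

    The wall reflection coefficient is taken as sqrt(1 - absorption),
    with `absorption` the energy absorption coefficient of that wall.
    """
    dist = np.linalg.norm(np.asarray(image, dtype=float) - np.asarray(mic, dtype=float))
    beta = np.sqrt(1.0 - absorption)
    return dist / c, beta / (4.0 * np.pi * dist)


# Example: a source at (1, 2, 1.5) m in a 4 x 5 x 3 m room.
images = first_order_images([1.0, 2.0, 1.5], [4.0, 5.0, 3.0])
delay, amplitude = arrival(images[0], mic=[2.0, 2.0, 1.5], absorption=0.19)
```

In this simplified setting, the two images mirrored across opposite walls are separated by exactly twice the room dimension along that axis, which gives some intuition for why localized first-order image sources carry information about the room geometry.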