Real-world acoustic scenes often contain multiple sound sources within a single room. These sources occupy different locations, so their sounds reach the listener from different directions. This complicates room impulse response (RIR) estimation, because each source has its own RIR, determined by its location and orientation. When the input signal is a mixture of multiple reverberated sources, two questions arise: which RIR should be predicted, and how should it be predicted? To address them, we propose a new task, predicting a "representative" RIR for a room in a multi-source environment, and present a training method for it. Unlike a model trained in a single-source environment, our method performs robustly regardless of the number of sources in the environment.
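To make the multi-source setting concrete, the sketch below builds a mixture in which each dry source is convolved with its own RIR before the results are summed. This is only an illustration of the input signal described above, not the proposed training method. The function names, the signal lengths, and the toy RIR model (exponentially decaying noise) are all assumptions introduced here.

```python
import numpy as np

def synthetic_rir(rng, length, decay):
    """Toy RIR: exponentially decaying white noise.
    An illustrative stand-in, not a measured or simulated room response."""
    t = np.arange(length)
    return rng.standard_normal(length) * np.exp(-decay * t)

def reverberant_mixture(sources, rirs):
    """Convolve each dry source with its own RIR and sum the results."""
    out_len = max(len(s) + len(h) - 1 for s, h in zip(sources, rirs))
    mix = np.zeros(out_len)
    for s, h in zip(sources, rirs):
        wet = np.convolve(s, h)   # reverberated version of one source
        mix[: len(wet)] += wet    # sources add linearly at the listener
    return mix

rng = np.random.default_rng(0)
# Three hypothetical dry sources, each paired with a different RIR
# because each is assumed to sit at a different position in the room.
sources = [rng.standard_normal(1600) for _ in range(3)]
rirs = [synthetic_rir(rng, 800, decay) for decay in (0.004, 0.006, 0.008)]
mixture = reverberant_mixture(sources, rirs)
```

An estimator that receives only `mixture` has no single ground-truth RIR to recover, since three different responses contributed to it. This is the ambiguity that motivates defining a "representative" RIR for the room.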