Head-related transfer functions (HRTFs) are important for immersive audio, and their spatial interpolation has been studied as a way to upsample a finite set of measurements. Recently, neural fields (NFs), which map a sound source direction to an HRTF, have gained attention. Existing NF-based methods focus on estimating the HRTF magnitude for a given sound source direction, and the magnitude is then converted to a finite impulse response (FIR) filter. We propose the neural infinite impulse response filter field (NIIRF) method, which instead estimates the coefficients of cascaded IIR filters. Because IIR filters mimic the modal nature of HRTFs, they need fewer coefficients than FIR filters to approximate them well. We find that our method can match the performance of existing NF-based methods on multiple datasets and even outperforms them when measurements are sparse. We also explore approaches to personalizing the NF to a subject and find experimentally that low-rank adaptation is effective.
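To illustrate the cascaded-IIR idea, the following is a minimal sketch in pure Python that evaluates the frequency response of a cascade of second-order sections. It is not the NIIRF implementation: the helper names (`resonator`, `biquad_response`, `cascade_response`, `is_stable`), the pole radii, and the centre frequencies are all assumptions chosen for illustration. In the setting described above, the coefficients would instead be predicted by a neural field from the sound source direction.

```python
import cmath
import math


def biquad_response(b, a, omega):
    """Response of one second-order section at omega (radians per sample)."""
    z1 = cmath.exp(-1j * omega)  # z^-1 evaluated on the unit circle
    z2 = z1 * z1                 # z^-2
    num = b[0] + b[1] * z1 + b[2] * z2
    den = a[0] + a[1] * z1 + a[2] * z2
    return num / den


def cascade_response(sections, omega):
    """Response of a cascade: the product of the section responses."""
    h = 1.0 + 0.0j
    for b, a in sections:
        h *= biquad_response(b, a, omega)
    return h


def resonator(center, radius):
    """Hypothetical helper: two-pole section with poles at radius*exp(+/-j*center)."""
    b = (1.0, 0.0, 0.0)
    a = (1.0, -2.0 * radius * math.cos(center), radius * radius)
    return b, a


def is_stable(a):
    """Stability triangle for a section with a[0] == 1."""
    return abs(a[2]) < 1.0 and abs(a[1]) < 1.0 + a[2]


# Assumed example: two resonant modes, each described by two denominator coefficients.
sections = [resonator(math.pi / 4, 0.95), resonator(math.pi / 2, 0.90)]

# Magnitude in dB on a uniform grid from 0 to pi.
grid = [k * math.pi / 64 for k in range(65)]
magnitude_db = [20.0 * math.log10(abs(cascade_response(sections, w))) for w in grid]
peak_index = max(range(len(grid)), key=lambda k: magnitude_db[k])
```

Each section here models one sharp resonance with only two non-trivial coefficients. The impulse response of a pole pair with radius 0.95 decays roughly as 0.95^n, so an FIR filter would need on the order of 90 taps before that tail falls below 1% of its initial envelope, which gives an informal sense of why a modal response can be cheaper to express with IIR sections.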