Magnitude-Corrected and Time-Aligned Interpolation of Head-Related Transfer Functions

Head-related transfer functions (HRTFs) are essential for virtual acoustic realities, as they contain all cues for localizing sound sources in three-dimensional space. Acoustic measurements are one way to obtain high-quality HRTFs. To reduce measurement time, cost, and complexity of measurement systems, a promising approach is to capture only a few HRTFs on a sparse sampling grid and then upsample them to a dense HRTF set by interpolation. However, HRTF interpolation is challenging because small changes in source position can result in significant changes in the HRTF phase and magnitude response. Previous studies greatly improved the interpolation by time-aligning the HRTFs in preprocessing, but magnitude interpolation errors, especially in contralateral regions, remain a problem. Building upon the time-alignment approaches, we propose an additional post-interpolation magnitude correction derived from a frequency-smoothed HRTF representation. Employing all 96 individual simulated HRTF sets of the HUTUBS database, we show that the magnitude correction significantly reduces interpolation errors compared to state-of-the-art interpolation methods applying only time alignment. Our analysis shows that when upsampling very sparse HRTF sets, the subject-averaged magnitude error in the critical higher frequency range is up to 1.5 dB lower when averaged over all directions and even up to 4 dB lower in the contralateral region. As a result, the interaural level differences in the upsampled HRTFs are considerably improved. The proposed algorithm thus has the potential to further reduce the minimum number of HRTFs required for perceptually transparent interpolation.
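The idea of a post-interpolation magnitude correction can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the interpolation scheme or the smoothing, so the sketch assumes a two-point linear interpolation between neighboring, already time-aligned HRTF spectra and a simple moving average as a stand-in for the frequency smoothing. All function names and parameters (`smooth_magnitude`, `interpolate_with_magnitude_correction`, `width`, `weight`) are hypothetical.

```python
import numpy as np


def smooth_magnitude(mag, width=5):
    """Moving-average smoothing along frequency (assumed stand-in for the
    frequency-smoothed HRTF representation; width must be odd)."""
    kernel = np.ones(width) / width
    padded = np.pad(mag, (width // 2, width // 2), mode="edge")
    return np.convolve(padded, kernel, mode="valid")


def interpolate_with_magnitude_correction(h_a, h_b, weight, width=5):
    """Interpolate two complex HRTF spectra and correct the result's magnitude.

    Returns (plain, corrected): the uncorrected and the corrected spectrum.
    """
    # Complex-valued interpolation; residual phase differences between the
    # two inputs cause partial cancellation and thus magnitude loss.
    plain = (1.0 - weight) * h_a + weight * h_b

    # Target magnitude: interpolate the smoothed magnitudes directly,
    # which is not affected by phase cancellation.
    target = ((1.0 - weight) * smooth_magnitude(np.abs(h_a), width)
              + weight * smooth_magnitude(np.abs(h_b), width))

    # Real-valued, positive correction: ratio of the target to the smoothed
    # magnitude of the interpolated spectrum. The phase is left untouched.
    actual = smooth_magnitude(np.abs(plain), width)
    correction = target / np.maximum(actual, 1e-12)
    return plain, plain * correction


def magnitude_error_db(h, h_ref):
    """Mean absolute magnitude difference in dB between two spectra."""
    return float(np.mean(np.abs(20.0 * np.log10(np.abs(h) / np.abs(h_ref)))))


# Toy data (assumption): two unit-magnitude spectra whose phase difference
# grows with frequency, mimicking imperfectly aligned neighboring HRTFs.
freqs = np.linspace(0.0, 1.0, 64)
h_a = np.ones(64, dtype=complex)
h_b = np.exp(-2j * freqs)
h_ref = np.ones(64, dtype=complex)  # true magnitude is 1 at every frequency

plain, corrected = interpolate_with_magnitude_correction(h_a, h_b, weight=0.5)
error_plain = magnitude_error_db(plain, h_ref)
error_corrected = magnitude_error_db(corrected, h_ref)
print(f"plain: {error_plain:.2f} dB, corrected: {error_corrected:.2f} dB")
```

In this toy case the corrected spectrum has a lower magnitude error than the plain interpolation because the correction restores the energy lost to phase cancellation, while the interpolated phase stays unchanged. A real HRTF set would require interpolation over the sphere and a perceptually motivated smoothing, both of which are outside the scope of this sketch.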
