The concept of metamerism originates from colorimetry, where it describes a sensation of visual similarity between two colored lights despite significant differences in spectral content. Likewise, we propose to call ``musical metamerism'' the sensation of auditory similarity elicited by two music fragments that differ in their underlying waveforms. In this technical report, we describe a method to generate musical metamers from any audio recording. Our method is based on joint time--frequency scattering (JTFS) as implemented in Kymatio, an open-source Python package that supports GPU computing and automatic differentiation. The advantage of our method is that it does not require any manual preprocessing, such as transcription, beat tracking, or source separation. We provide a mathematical description of JTFS as well as some excerpts from the Kymatio source code. Lastly, we review prior work on JTFS and draw connections with closely related algorithms, such as spectrotemporal receptive fields (STRF), modulation power spectra (MPS), and Gabor filterbanks (GBFB).