Music source restoration (MSR) aims to recover unprocessed stems from mixed and mastered recordings. The challenge lies both in separating overlapping sources and in reconstructing signals degraded by production effects such as compression and reverberation. We therefore propose DTT-BSR, a hybrid generative adversarial network (GAN) that combines a rotary positional embedding (RoPE) transformer for long-term temporal modeling with a dual-path band-split recurrent neural network (RNN) for multi-resolution spectral processing. Our model achieved 3rd place on the objective leaderboard and 4th place on the subjective leaderboard of the ICASSP 2026 MSR Challenge, demonstrating exceptional generation fidelity and semantic alignment at a compact size of 7.1M parameters.
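To illustrate why rotary positional embeddings suit long-term temporal modeling, the following is a minimal sketch of the standard RoPE rotation in plain Python. It is not the DTT-BSR implementation: the function names `rope_rotate` and `dot`, the frequency base of 10000, and the toy vectors are assumptions made for illustration only. The sketch shows that the attention score between a rotated query and key depends on their relative offset rather than their absolute positions.

```python
import math


def rope_rotate(vec, position, base=10000.0):
    """Rotate consecutive feature pairs of `vec` by position-dependent angles.

    Hypothetical helper: follows the commonly used RoPE formulation,
    not necessarily the exact variant used in DTT-BSR.
    """
    dim = len(vec)
    if dim % 2 != 0:
        raise ValueError("RoPE needs an even feature dimension")
    out = []
    for i in range(0, dim, 2):
        # Pair (i, i+1) rotates with angular frequency base^(-i/dim).
        theta = position * base ** (-i / dim)
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out.extend([x * c - y * s, x * s + y * c])
    return out


def dot(a, b):
    """Plain dot product, standing in for an unscaled attention score."""
    return sum(x * y for x, y in zip(a, b))


# Toy query and key vectors (assumed values, for illustration only).
q = [0.5, -1.0, 2.0, 0.25]
k = [1.5, 0.3, -0.7, 1.0]

# The same offset of 3 frames, at two different absolute positions.
score_early = dot(rope_rotate(q, 5), rope_rotate(k, 2))
score_late = dot(rope_rotate(q, 105), rope_rotate(k, 102))
```

Under these assumptions, `score_early` and `score_late` agree up to floating-point error, which is the relative-position property that makes RoPE attractive for long audio sequences.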