Escaping the Linearity Trap: Manifold Detours for Black-Box Adversarial Attacks on Singing Audio Deepfake Detection

Recent advances in Singing Voice Synthesis (SVS) enable highly realistic but potentially malicious AI covers, making singing voice deepfake detection (SVDD) crucial. Detectors based on Self-Supervised Learning (SSL) achieve state-of-the-art performance by fine-tuning speech SSL backbones to capture singing-specific spoof artifacts. Existing adversarial attacks often fail against these SSL-based detectors (SSL-SVDD), creating a false impression of inherent robustness. We show that this failure stems from two challenges. First, at the objective level, attacks optimize cross-entropy on local surrogates and therefore cross surrogate-specific decision boundaries rather than suppressing spoof evidence shared across detectors. Second, at the method level, attacks follow the surrogate's dominant gradient direction. In SSL-SVDD, this direction aligns with the fine-tuned, artifact-sensitive directions, which limits transferability to unseen detectors, a geometric failure we term the Linearity Trap. To properly evaluate robustness, we propose MARS (Meta-Adversarial Regression of Semantics), a transfer-based black-box framework tailored to SSL-SVDD. Structurally, MARS shifts to hypothesis-evidence manipulation by constructing a natural semantic anchor from the pre-trained SSL space and an artifact anchor from the fine-tuned space. Algorithmically, MARS escapes the Linearity Trap via bi-level optimization: the inner stage induces tangential exploration, while the outer stage guides the audio toward the natural semantic manifold. Experiments on the CtrSVDD benchmark show that MARS improves Attack Success Rate (ASR) in in-distribution transfer (13%), out-of-distribution transfer (10%), and cross-task evaluation (36%), highlighting the urgent need for robust SVDD systems.
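
To make the bi-level idea concrete, the following is a minimal toy sketch, not the MARS implementation. It assumes linear stand-ins for the pre-trained and fine-tuned feature spaces (`W_pre`, `W_ft`), a linear surrogate score (`w_clf`), a single natural anchor (`a_nat`), random tangential exploration, and an L-infinity budget `eps`. All of these names and choices are hypothetical and do not come from the abstract. The inner loop moves only along directions orthogonal to the surrogate's dominant gradient, and the outer step pulls the explored point toward the natural anchor.

```python
import numpy as np

def tangential_component(v, g, tiny=1e-12):
    """Remove from v its component along g, leaving the part orthogonal to g."""
    return v - (v @ g) / (g @ g + tiny) * g

def bilevel_attack_sketch(x0, W_pre, W_ft, w_clf, a_nat, eps=0.05, alpha=0.01,
                          outer_steps=20, inner_steps=3, seed=0):
    """Toy bi-level loop (illustrative assumption, not the paper's algorithm)."""
    rng = np.random.default_rng(seed)
    x = x0.copy()
    # Dominant surrogate direction: gradient of the linear score
    # w_clf . (W_ft x) with respect to x (constant for this linear toy model).
    g = W_ft.T @ w_clf
    for _ in range(outer_steps):
        # Inner stage: explore along directions tangential (orthogonal) to g.
        z = x.copy()
        for _ in range(inner_steps):
            t = tangential_component(rng.standard_normal(x.shape), g)
            z = z + alpha * t / (np.linalg.norm(t) + 1e-12)
        # Outer stage: gradient of 0.5 * ||W_pre z - a_nat||^2 at the explored
        # point, used to move x toward the natural anchor.
        grad = W_pre.T @ (W_pre @ z - a_nat)
        x = x - alpha * np.sign(grad)
        # Keep the perturbation inside the L-infinity ball of radius eps.
        x = x0 + np.clip(x - x0, -eps, eps)
    return x

# Hypothetical toy data, for illustration only.
rng = np.random.default_rng(1)
d, k = 16, 8
W_pre, W_ft = rng.standard_normal((k, d)), rng.standard_normal((k, d))
w_clf, a_nat = rng.standard_normal(k), rng.standard_normal(k)
x0 = rng.standard_normal(d)
x_adv = bilevel_attack_sketch(x0, W_pre, W_ft, w_clf, a_nat)
```

In a real SSL-SVDD setting the feature maps are nonlinear, so the dominant gradient would have to be recomputed at each step with automatic differentiation. The sketch only illustrates the geometry of stepping around, not along, the surrogate's steepest direction.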
