Bob's Confetti: Phonetic Memorization Attacks in Music and Video Generation

Generative AI systems for music and video commonly use text-based filters to prevent regurgitation of copyrighted material. We expose a significant vulnerability in this approach by introducing Adversarial PhoneTic Prompting (APT), a novel attack that bypasses these safeguards by exploiting phonetic memorization: the tendency of models to bind sub-lexical acoustic patterns (phonemes, rhyme, stress, cadence) to memorized copyrighted content. APT replaces iconic lyrics with homophonic but semantically unrelated alternatives (e.g., "mom's spaghetti" becomes "Bob's confetti"), preserving phonetic structure while evading lexical filters. We evaluate APT on leading lyrics-to-song models (Suno, YuE) across English and Korean songs spanning rap, pop, and K-pop. APT achieves 91% average similarity to copyrighted originals, versus 13.7% for random lyrics and 42.2% for semantic paraphrases. Embedding analysis confirms the mechanism: YuE's text encoder treats APT-modified lyrics as near-identical to originals (cosine similarity 0.90) while Sentence-BERT semantic similarity drops to 0.71, showing the model encodes phonetic structure over meaning. This vulnerability extends cross-modally: Veo 3 reconstructs visual scenes from original music videos when prompted with APT lyrics alone, despite no visual cues in the prompt. We further show that phonetic-semantic defense signatures fail, as APT prompts exhibit higher semantic similarity than benign paraphrases. Our findings reveal that sub-lexical acoustic structure acts as a cross-modal retrieval key, rendering current copyright filters systematically vulnerable. Demo examples are available at https://jrohsc.github.io/music_attack/.
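To make the "mom's spaghetti" versus "Bob's confetti" example concrete, the following minimal Python sketch contrasts a word-level overlap score, of the kind a lexical filter might rely on, with a crude phonetic comparison. It is an illustration only and not the paper's evaluation pipeline. The pronunciation table is a hand-entered, ARPAbet-style approximation, and the names `lexical_overlap`, `vowel_skeleton`, and `skeleton_match` are hypothetical helpers introduced here.

```python
# Illustrative sketch: word overlap versus a crude phonetic comparison.
# ASSUMPTION: the pronunciations below are hand-entered, ARPAbet-style
# approximations for this example; they are not taken from the paper.
PRONUNCIATIONS = {
    "mom's": ["M", "AA1", "M", "Z"],
    "spaghetti": ["S", "P", "AH0", "G", "EH1", "T", "IY0"],
    "bob's": ["B", "AA1", "B", "Z"],
    "confetti": ["K", "AH0", "N", "F", "EH1", "T", "IY0"],
}


def words(phrase):
    """Lowercase, whitespace-delimited tokens."""
    return phrase.lower().split()


def lexical_overlap(a, b):
    """Jaccard overlap of the two phrases' word sets (what a word filter sees)."""
    wa, wb = set(words(a)), set(words(b))
    union = wa | wb
    return len(wa & wb) / len(union) if union else 0.0


def vowel_skeleton(phrase):
    """Ordered vowel phonemes with stress digits: a rough proxy for rhyme and stress."""
    return [
        phoneme
        for word in words(phrase)
        for phoneme in PRONUNCIATIONS[word]
        if phoneme[-1].isdigit()  # in ARPAbet, only vowels carry a stress digit
    ]


def skeleton_match(a, b):
    """Fraction of aligned positions where the vowel skeletons agree."""
    sa, sb = vowel_skeleton(a), vowel_skeleton(b)
    longest = max(len(sa), len(sb))
    if longest == 0:
        return 1.0
    return sum(x == y for x, y in zip(sa, sb)) / longest


if __name__ == "__main__":
    original, substitute = "mom's spaghetti", "Bob's confetti"
    print("word overlap:", lexical_overlap(original, substitute))
    print("vowel skeleton match:", skeleton_match(original, substitute))
```

Under these assumed pronunciations, the two phrases share no words, yet their vowel and stress skeletons coincide, which is the kind of gap between lexical and phonetic similarity that APT is described as exploiting.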
