As generative platforms such as Suno and Udio reach human-level audio quality, the use of AI has expanded across the entire music production workflow. Beyond simple track generation, these advances have driven the adoption of AI-based methods for vocal synthesis, arrangement, and professional mastering. However, current detection research remains largely confined to a binary `AI-or-human' paradigm that fails to reflect contemporary production practice. In real-world production, AI tools are increasingly used to refine or master human-produced tracks, and human engineers likewise post-process AI-generated material to ensure professional quality. Moreover, users often employ adversarial tactics to bypass AI detectors, such as applying human mastering to AI-generated tracks. This creates a grey area that binary classification cannot capture. In this paper, we define and investigate ``AI Music Tracking'': the challenge of identifying the specific forms of AI integration across the full spectrum of music production. To this end, we introduce HAIM, a dataset with diverse labels for the stages of music production, designed to isolate individual stages of AI intervention, including hybrid production and agent-level tracking. Our evaluation of state-of-the-art detectors reveals systemic flaws. By releasing HAIM, we propose a new benchmark that shifts the field beyond binary classification toward a granular, structured evaluation of AI music.