As the global burden of mental health conditions escalates, music-based interventions have attracted significant attention as a non-invasive, cost-effective modality for emotion regulation and psychological stress relief. However, current digital music services rely on static preferences and do not adapt to users' instantaneous psychological states. Furthermore, mapping electroencephalography (EEG) directly to music generation remains challenging because paired data are severely scarce and such direct mappings lack interpretability. To address these limitations, we propose MindMelody, a fully functional, closed-loop, real-time system for EEG-driven personalized music intervention. Rather than mapping EEG to audio directly, MindMelody introduces an emotion-mediated semantic bridge. Specifically, a hybrid Transformer-GNN first decodes real-time EEG signals into global Valence-Arousal states and local temporal affect trajectories. These states are then fed into a Large Language Model (LLM) equipped with Retrieval-Augmented Generation (RAG), which formulates structured intervention plans. Subsequently, a novel Hierarchical EEG Controller injects global affect prefixes and local temporal guidance into a pretrained music backbone, enabling fine-grained, controllable audio synthesis. Crucially, the system incorporates a continuous feedback loop that updates generation parameters on the fly based on the user's evolving EEG dynamics. Extensive experiments show that MindMelody improves control adherence and emotional alignment and is rated higher on perceived helpfulness in a short-term listening setting, suggesting its promise as an adaptive, affect-aware music generation framework.
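To make the closed-loop idea concrete, the following is a minimal illustrative sketch, not MindMelody's implementation. It assumes the EEG decoder has already produced Valence-Arousal estimates scaled to [-1, 1], and it replaces the LLM planner and the Hierarchical EEG Controller with a simple rule. All names (`Affect`, `step_toward`, `to_controls`, `closed_loop`) and the tempo and mode mapping are hypothetical choices made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Affect:
    """A point in Valence-Arousal space; each axis is assumed to lie in [-1, 1]."""
    valence: float
    arousal: float


def clamp(x: float, lo: float = -1.0, hi: float = 1.0) -> float:
    """Restrict x to the closed interval [lo, hi]."""
    return max(lo, min(hi, x))


def step_toward(current: Affect, goal: Affect, max_step: float = 0.25) -> Affect:
    """Choose the next music target: move from the decoded state toward the
    goal by at most max_step per axis, so the music shifts gradually."""
    def move(c: float, g: float) -> float:
        return clamp(c + clamp(g - c, -max_step, max_step))

    return Affect(move(current.valence, goal.valence),
                  move(current.arousal, goal.arousal))


def to_controls(target: Affect) -> dict:
    """Hypothetical mapping from a target affect to generation controls."""
    return {
        "tempo_bpm": round(90 + 40 * target.arousal),  # 50..130 BPM
        "mode": "major" if target.valence >= 0 else "minor",
    }


def closed_loop(decoded_states, goal: Affect) -> list:
    """Recompute the controls each time a new decoded EEG state arrives."""
    return [to_controls(step_toward(state, goal)) for state in decoded_states]
```

For example, calling `closed_loop([Affect(-0.5, 0.8), Affect(-0.2, 0.5)], Affect(0.5, 0.0))` yields one set of controls per decoded state, each nudged toward the calmer, more positive goal. Capping the per-step change is one simple way to avoid abrupt musical transitions; the actual system may use a very different update rule.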