Live music performances are charming in part because of the unpredictability of improvisation, which arises from the dynamic between musicians and their interactions with the audience. Jazz improvisation is a particularly noteworthy example for investigation from a theoretical perspective. Here, we introduce a novel game-theoretic model of jazz improvisation, providing a mathematical framework for studying music theory and improvisational methodologies. We use computational modeling, mainly reinforcement learning, to explore diverse stochastic improvisational strategies and how they perform when paired. We find that the most effective pairing combines a strategy that reacts to the most recent payoff (Stepwise Changes) with a reinforcement learning strategy restricted to notes in the given chord (Chord-Following Reinforcement Learning). Conversely, pairing strategies that react to the partner's last note and attempt to harmonize with it (Harmony Prediction) yields the lowest non-control payoff and the highest standard deviation, indicating that choosing notes as immediate reactions to the partner can produce inconsistent outcomes. On average, the Chord-Following Reinforcement Learning strategy achieves the highest mean payoff, while Harmony Prediction achieves the lowest. Our work lays the foundation for promising applications beyond jazz, including the use of artificial intelligence (AI) models to extract data from audio clips to refine musical reward systems, and the training of machine learning (ML) models on existing jazz solos to further refine strategies within the game.
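To make the Chord-Following Reinforcement Learning idea concrete, the sketch below implements an epsilon-greedy value learner whose playable actions are restricted to the tones of the current chord. This is a minimal illustration, not the paper's implementation: the chord (Cmaj7 pitch classes), the binary payoff function, and the hyperparameters `epsilon` and `alpha` are all illustrative assumptions.

```python
import random

# Assumed chord for illustration: Cmaj7 as pitch classes (C, E, G, B).
CHORD_TONES = {0, 4, 7, 11}

def payoff(note):
    """Hypothetical reward: 1.0 for a chord tone, 0.0 otherwise."""
    return 1.0 if note % 12 in CHORD_TONES else 0.0

class ChordFollowingRL:
    """Epsilon-greedy value learner restricted to notes in the given chord."""

    def __init__(self, chord_tones, epsilon=0.1, alpha=0.5, seed=0):
        self.actions = sorted(chord_tones)          # only chord tones are playable
        self.q = {a: 0.0 for a in self.actions}     # estimated value per note
        self.epsilon, self.alpha = epsilon, alpha
        self.rng = random.Random(seed)

    def choose(self):
        # Explore with probability epsilon, otherwise play the best-known note.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[a])

    def update(self, note, reward):
        # Incremental value update toward the observed payoff.
        self.q[note] += self.alpha * (reward - self.q[note])

agent = ChordFollowingRL(CHORD_TONES)
for _ in range(100):
    note = agent.choose()
    agent.update(note, payoff(note))
```

Because every playable note here is a chord tone, the learner's value estimates converge toward the full payoff; in the paper's richer setting, the payoff would instead depend on the interaction between the two players' strategies.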