Artificial Intelligence (AI) song generation has become a popular research topic, yet few studies explore the latent correlations between specific lyrical and rhythmic features. This pilot study investigates the relationship between keywords and rhythmically stressed features, such as strong beats, in songs. It considers three binary distinctions: keywords versus non-keywords, stressed versus unstressed syllables, and strong versus weak beats. Experimental results indicate that, on average, 80.8% of keywords land on strong beats, whereas 62% of non-keywords fall on weak beats. By comparison, the relationship between syllable stress and beat strength is weak, so among the features examined, keywords show the strongest association with strong beats. The lyrics-rhythm matching score, a metric that measures how often keywords fall on strong beats and non-keywords fall on weak beats across various time signatures, is 0.765 for word types, compared with 0.495 for syllable types. Word types therefore align far more closely with their corresponding beat types than syllable types do, which suggests that word types are the more reliable feature for capturing rhythmic structure and for rhythmic matching and analysis. We conclude that keywords that consistently align with strong beats are more reliable indicators of lyrics-rhythm associations, a finding that can inform AI-driven song generation through improved structural analysis. We also develop tailored Lyrics-Rhythm Matching (LRM) metrics that maximize the alignment of lyrics with the corresponding beat stresses, and a novel LRM file format that captures the critical lyrical and rhythmic information without requiring the original sheet music.
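To make the idea of a matching score concrete, the following is a minimal Python sketch, not the LRM metric itself, whose exact formula is not given here. It assumes the score can be read as the fraction of words whose type agrees with their beat type (keyword on a strong beat, or non-keyword on a weak beat). The `Token` class, the function names, and the five-word example are all hypothetical and use made-up data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    """One sung word with hypothetical, hand-assigned labels."""
    text: str
    is_keyword: bool       # assumed to come from some keyword-labelling step
    on_strong_beat: bool   # assumed to come from the time signature and onset


def matching_score(tokens):
    """Fraction of tokens whose word type agrees with the beat type.

    Assumption: a match is a keyword on a strong beat or a non-keyword
    on a weak beat. This is an illustrative reading, not the paper's formula.
    """
    if not tokens:
        raise ValueError("tokens must not be empty")
    matched = sum(1 for t in tokens if t.is_keyword == t.on_strong_beat)
    return matched / len(tokens)


def keyword_strong_rate(tokens):
    """Share of keywords that land on strong beats."""
    keywords = [t for t in tokens if t.is_keyword]
    if not keywords:
        raise ValueError("no keywords in tokens")
    return sum(1 for t in keywords if t.on_strong_beat) / len(keywords)


# Made-up example line, for illustration only.
example = [
    Token("shine", True, True),
    Token("on", False, False),
    Token("the", False, True),    # non-keyword on a strong beat: a mismatch
    Token("river", True, True),
    Token("and", False, False),
]

score = matching_score(example)
kw_rate = keyword_strong_rate(example)
```

Under this reading, the same function could be applied to syllable labels (stressed versus unstressed) in place of keyword labels to obtain the syllable-type score that the abstract compares against.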