Existing pitch curve generators face two main challenges. First, they often neglect singer-specific expressiveness, which limits their ability to capture individual singing styles. Second, they are typically developed as auxiliary modules for specific tasks such as pitch correction, singing voice synthesis, or voice conversion, which restricts their generalization. We propose StylePitcher, a general-purpose pitch curve generator that learns singer style from reference audio while preserving alignment with the intended melody. Built on a rectified flow matching architecture, StylePitcher conditions generation on symbolic music scores and pitch context, and adapts to diverse singing tasks without retraining. Objective and subjective evaluations across various singing tasks show that StylePitcher improves style similarity and audio quality while maintaining pitch accuracy comparable to task-specific baselines.
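For readers unfamiliar with rectified flow matching, the following is a minimal sketch of the general technique, not StylePitcher's actual implementation. It assumes the standard straight-line formulation, where a noise sample `x0` is carried to a data sample `x1` along `x_t = (1 - t) * x0 + t * x1` with constant target velocity `x1 - x0`, and where sampling integrates a velocity field with Euler steps. The function names, the `cond` dictionary, and the `oracle_velocity` stand-in for a trained network are all hypothetical and chosen for illustration only.

```python
import numpy as np


def rectified_flow_pair(x0, x1, t):
    """Return the point on the straight path at time t and its target velocity.

    x0: noise sample, x1: data sample (e.g. a pitch curve), t in [0, 1].
    """
    x_t = (1.0 - t) * x0 + t * x1
    v_target = x1 - x0  # constant along a straight path
    return x_t, v_target


def euler_sample(velocity_fn, x0, cond, num_steps=10):
    """Integrate dx/dt = velocity_fn(x, t, cond) from t=0 to t=1 with Euler steps."""
    x = np.array(x0, dtype=float)
    dt = 1.0 / num_steps
    for i in range(num_steps):
        t = i * dt
        x = x + dt * velocity_fn(x, t, cond)
    return x


def oracle_velocity(x, t, cond):
    """Hypothetical stand-in for a trained network (assumption, not the paper's model).

    It returns the exact straight-line velocity toward cond["target"]; a real
    model would instead predict the velocity from conditions such as a score
    and surrounding pitch context.
    """
    return (cond["target"] - x) / (1.0 - t)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    noise = rng.standard_normal(8)
    target_curve = np.linspace(60.0, 62.0, 8)  # toy pitch values, illustrative only
    generated = euler_sample(oracle_velocity, noise, {"target": target_curve})
    print(np.allclose(generated, target_curve))
```

In training, a network would be regressed onto `v_target` at randomly drawn `t`; the oracle above simply replaces that network so the sampler can be checked end to end.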