The quality of piano performance depends on nuanced timing, articulation, and dynamic control, but practice feedback is often summary-based and hard to act on. We introduce Profy, a weakly supervised system that learns from take-level labels derived from aggregated listener ratings (expert-labeled vs. amateur-labeled) to produce time-aligned highlights for review during piano practice. We collected synchronized audio and 1 kHz key-motion data from 73 pianists and used 1,083 valid takes for modeling and evaluation. The model outputs clip-level predictions together with evidence scores on a shared resampled model time base for visualization. On 20 amateur clips, drawn from short technique studies and annotated by 21 expert pianists, the displayed highlight score aligns with the passages that the experts marked for review, despite training without localized labels (Pearson r=0.61, ROC-AUC 0.75). Rather than summarizing a take with a single global score, Profy helps learners decide where to inspect next by supporting scrubbing, looping, and focused replay of time-localized passages associated with expert-amateur differences.