Generating handwritten music sheets is a crucial step toward improving Optical Music Recognition (OMR) systems, which rely on large and diverse datasets to perform well. However, handwritten music sheets, often found in archives, are difficult to digitise because of their fragility, varied handwriting styles, and variable image quality. This paper addresses the resulting data scarcity by applying Generative Adversarial Networks (GANs) to synthesise realistic handwritten music sheets. We provide a comprehensive evaluation of three GAN models (DCGAN, ProGAN, and CycleWGAN), comparing their ability to generate diverse, high-quality handwritten music images. The proposed CycleWGAN model, which improves style transfer and training stability, significantly outperforms DCGAN and ProGAN in both qualitative and quantitative evaluations. It achieves a Fréchet Inception Distance (FID) of 41.87, an Inception Score (IS) of 2.29, and a Kernel Inception Distance (KID) of 0.05, making it a promising solution for improving OMR systems.