In this paper, we present a new dataset and benchmark for the task of semantic similarity in song lyrics. The dataset, which originally consisted of 2775 pairs of Spanish songs, was annotated in a collective annotation experiment by 63 native annotators. After collecting and refining the data to ensure a high degree of consensus and data integrity, we obtained 676 high-quality annotated pairs, which we used to evaluate the performance of various state-of-the-art monolingual and multilingual language models. From this evaluation we established baseline results that we hope will be useful to the community for future academic and industrial work on this task.