Fine-Tuning Large Language Models for Automatic Detection of Sexually Explicit Content in Spanish-Language Song Lyrics

The proliferation of sexually explicit content in popular music genres such as reggaeton and trap, which are consumed predominantly by young audiences, has raised significant societal concern about the exposure of minors to potentially harmful lyrics. This paper presents an approach to the automatic detection of sexually explicit content in Spanish-language song lyrics by fine-tuning a Generative Pre-trained Transformer (GPT) model on a curated corpus of 100 songs, evenly divided between expert-labeled explicit and non-explicit categories. The methodology uses transfer learning to adapt the pre-trained model to the idiosyncratic linguistic features of urban Latin music, including slang, metaphors, and culturally specific double entendres that evade conventional dictionary-based filtering systems. Experimental evaluation on held-out test sets shows that, after a feedback-driven refinement loop, the fine-tuned model achieves 87% accuracy, 100% precision, and 100% specificity, outperforming both its pre-feedback configuration and a non-customized baseline ChatGPT model. In a comparative analysis, the fine-tuned model agrees with expert human classification in 59.2% of cases versus 55.1% for the standard model, indicating that domain-specific adaptation enhances sensitivity to implicit and culturally embedded sexual references. These findings support the viability of deploying fine-tuned large language models as automated content moderation tools on music streaming platforms. Building on these technical results, the paper develops a public policy proposal for a multi-tier, age-based content rating system for music, analogous to the PEGI system for video games, and analyzes it through the PESTEL framework and Kingdon's Multiple Streams Framework, establishing both the technological feasibility and the policy pathway for systematic music content regulation.
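To make the reported metrics concrete, the following minimal sketch shows how accuracy, precision, and specificity are derived from a binary confusion matrix, with "explicit" as the positive class. The function names and the toy labels are illustrative assumptions and are not the paper's code or data. The sketch also shows why 100% precision and 100% specificity go together: both equal 1 exactly when there are no false positives, so any remaining errors must be explicit songs that were missed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    tp: int  # explicit songs correctly flagged
    fp: int  # non-explicit songs wrongly flagged
    tn: int  # non-explicit songs correctly passed
    fn: int  # explicit songs that were missed


def confusion(y_true, y_pred):
    """Count outcomes for binary labels (1 = explicit, 0 = non-explicit)."""
    pairs = list(zip(y_true, y_pred))
    return Confusion(
        tp=sum(1 for t, p in pairs if t == 1 and p == 1),
        fp=sum(1 for t, p in pairs if t == 0 and p == 1),
        tn=sum(1 for t, p in pairs if t == 0 and p == 0),
        fn=sum(1 for t, p in pairs if t == 1 and p == 0),
    )


def _ratio(num, den):
    """Return num / den, or None when the denominator is zero."""
    return num / den if den else None


def metrics(c):
    """Standard definitions of the metrics named in the abstract, plus recall."""
    return {
        "accuracy": _ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn),
        "precision": _ratio(c.tp, c.tp + c.fp),    # TP / (TP + FP)
        "specificity": _ratio(c.tn, c.tn + c.fp),  # TN / (TN + FP)
        "recall": _ratio(c.tp, c.tp + c.fn),       # TP / (TP + FN)
    }


# Toy labels (hypothetical, NOT the paper's test set): one explicit song is
# missed and no non-explicit song is flagged.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0]
toy = metrics(confusion(y_true, y_pred))
print(toy)
```

In this toy case there are no false positives, so precision and specificity are both 1.0, while the single false negative lowers accuracy and recall. A classifier with this profile never flags a clean song but can still let some explicit lyrics through.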
