We present a systematic study of multilingual polarization detection across 22 languages for SemEval-2026 Task 9 (Subtask 1), contrasting multilingual generalists with language-specific specialists and hybrid ensembles. A standard generalist such as XLM-RoBERTa suffices when its tokenizer aligns well with the target text, but it can struggle with languages written in distinct scripts (e.g., Khmer, Odia), where monolingual specialists yield significant gains. Rather than enforcing a single universal architecture, we adopt a language-adaptive framework that switches between multilingual generalists, language-specific specialists, and hybrid ensembles based on development-set performance. Cross-lingual augmentation via NLLB-200 yields mixed results, often underperforming native architecture selection and degrading performance on morphologically rich tracks. Our final system achieves an overall macro-averaged F1 score of 0.796 and an average accuracy of 0.826 across all 22 tracks. Code and final test predictions are publicly available at: https://github.com/Maziarkiani/SemEval2026-Task9-Subtask1-Polarization.
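As an illustration of the language-adaptive selection described in the abstract, the sketch below picks, for each language, the candidate system with the highest macro-averaged F1 on development data. It is a minimal sketch under stated assumptions, not the released implementation: the function names `macro_f1` and `select_per_language`, the candidate labels `generalist` and `specialist`, the language keys, and all toy labels and predictions are hypothetical.

```python
from typing import Dict, Sequence


def macro_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Unweighted mean of per-class F1 over all classes seen in gold or predictions."""
    labels = sorted(set(y_true) | set(y_pred))
    per_class = []
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class is never hit.
        per_class.append(2 * tp / denom if denom else 0.0)
    return sum(per_class) / len(per_class)


def select_per_language(
    dev_gold: Dict[str, Sequence[int]],
    dev_preds: Dict[str, Dict[str, Sequence[int]]],
) -> Dict[str, str]:
    """For each language, return the candidate name with the best dev macro-F1.

    Ties go to the candidate listed first (hypothetical tie-breaking rule).
    """
    choice = {}
    for lang, gold in dev_gold.items():
        scores = {name: macro_f1(gold, preds) for name, preds in dev_preds[lang].items()}
        choice[lang] = max(scores, key=scores.get)
    return choice


# Toy development data (made up for illustration only).
dev_gold = {"khm": [1, 0, 1, 0], "eng": [1, 1, 0, 0]}
dev_preds = {
    "khm": {"generalist": [1, 1, 1, 1], "specialist": [1, 0, 1, 0]},
    "eng": {"generalist": [1, 1, 0, 0], "specialist": [1, 0, 0, 0]},
}
selected = select_per_language(dev_gold, dev_preds)
print(selected)
```

On this toy data the specialist is chosen for `khm` and the generalist for `eng`. A hybrid ensemble could be added as one more named candidate whose predictions are scored in the same way.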