The landscape of Large Language Models remains predominantly English-centric, resulting in a significant performance gap for other major languages, such as French, especially in the context of Small Language Models (SLMs). Existing multilingual models demonstrate considerably lower performance in French compared to English, and research on efficient adaptation methods for French remains limited. To address this, we introduce \textbf{Luth}, a family of French-specialized SLMs: through targeted post-training on curated, high-quality French data, our models outperform all open-source counterparts of comparable size on multiple French benchmarks while retaining their original English capabilities. We further show that strategic model merging enhances performance in both languages, establishing Luth as a new state of the art for French SLMs and a robust baseline for future French-language research.