Synthetic data has become a cornerstone for scaling large language models, yet its multilingual use remains bottlenecked by translation-based prompts. This strategy inherits English-centric framing and style while neglecting cultural dimensions, ultimately constraining model generalization. We argue that the overlooked prompt space, the very inputs that define training distributions, offers a more powerful lever for improving multilingual performance. We introduce a lightweight framework for prompt-space optimization in which translated prompts are systematically transformed along three axes: Naturalness, Cultural Adaptation, and Difficulty Enhancement. Using an off-the-shelf multilingual LLM, we apply these transformations to prompts for 12 languages spanning 7 language families. Under identical data conditions, our approach achieves substantial and consistent downstream improvements over the translation-only baseline: +4.7% accuracy on Global-MMLU, +2.4% XCometXL on Flores, and a +35.3% win rate in preference evaluations on mArenaHard. We establish prompt-space optimization as a simple yet powerful paradigm for building multilingual LLMs that are more robust, culturally grounded, and globally capable.
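To make the transformation pipeline concrete, the sketch below shows one plausible reading of the framework in Python. The `llm` callable, the meta-prompt templates, and the sequential composition of the three transformations are all illustrative assumptions; the abstract does not specify the paper's actual prompts or orchestration.

```python
from typing import Callable

# Hypothetical meta-prompts for the three transformations named in the
# abstract (Naturalness, Cultural Adaptation, Difficulty Enhancement);
# the paper's actual instructions are not reproduced here.
TRANSFORMATIONS = {
    "naturalness": (
        "Rewrite the following prompt so it reads as if originally written "
        "by a native speaker of {lang}, removing translationese:\n\n{prompt}"
    ),
    "cultural_adaptation": (
        "Adapt the following prompt to the cultural context of {lang} "
        "speakers (names, places, customs, examples), preserving its "
        "intent:\n\n{prompt}"
    ),
    "difficulty_enhancement": (
        "Increase the reasoning difficulty of the following prompt while "
        "keeping it answerable and on-topic:\n\n{prompt}"
    ),
}


def optimize_prompt(prompt: str, lang: str, llm: Callable[[str], str]) -> str:
    """Apply the three prompt-space transformations to a translated prompt.

    `llm` stands in for any off-the-shelf multilingual LLM wrapped as a
    text-in/text-out function. Chaining the transformations in sequence is
    one plausible composition; the paper may apply them differently.
    """
    for template in TRANSFORMATIONS.values():
        prompt = llm(template.format(lang=lang, prompt=prompt))
    return prompt


# Usage (hypothetical): optimized = optimize_prompt(p, "Swahili", my_llm)
```

The key design point this sketch illustrates is that optimization happens entirely in the prompt space: the training targets and data volume are untouched, so downstream gains are attributable to the rewritten inputs alone.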