Output diversity is crucial for Large Language Models, as it underpins pluralism and creativity. In this work, we reveal that controlling the language a model uses while thinking (its language of thought) provides a novel, structural source of output diversity. Our preliminary study shows that different thinking languages occupy distinct regions of a model's thinking space. Building on this observation, we study two repeated-sampling strategies under multilingual thinking, Single-Language Sampling and Mixed-Language Sampling, and evaluate the diversity of outputs that are constrained to English regardless of the thinking language used. Across extensive experiments, we demonstrate that switching the thinking language from English to a non-English language consistently increases output diversity, with a clear and consistent positive correlation: languages farther from English in the thinking space yield larger gains. We further show that aggregating samples across multiple thinking languages brings additional improvements through compositional effects, and that scaling sampling with linguistic heterogeneity raises the model's diversity ceiling. Finally, we show that these findings translate into practical benefits in pluralistic alignment scenarios, broadening the coverage of cultural knowledge and value orientations in LLM outputs. Our code is publicly available at https://github.com/iNLP-Lab/Multilingual-LoT-Diversity.
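To make the two sampling strategies concrete, here is a minimal sketch (not the released code from the repository above). The language list, the prompt wording inside `sample_once`, and the distinct-1 metric are all illustrative assumptions: `sample_once` stands in for a real LLM call whose chain of thought is constrained to `lang` while the final answer is constrained to English, and the paper's actual diversity metric may differ from the simple lexical proxy used here.

```python
# Sketch of Single-Language vs. Mixed-Language repeated sampling under
# multilingual thinking. All names below are illustrative, not the
# authors' API.

LANGS = ["en", "de", "zh", "ja", "ar"]  # assumed thinking-language set


def sample_once(question: str, lang: str) -> str:
    # Placeholder for an LLM call, e.g. a prompt like:
    #   "Think step by step in {lang}, then answer in English."
    return f"[English answer to {question!r} after thinking in {lang}]"


def single_language_sampling(question: str, lang: str, n: int) -> list[str]:
    # All n samples share a single thinking language.
    return [sample_once(question, lang) for _ in range(n)]


def mixed_language_sampling(question: str, langs: list[str], n: int) -> list[str]:
    # Samples are spread round-robin across several thinking languages.
    return [sample_once(question, langs[i % len(langs)]) for i in range(n)]


def distinct_1(outputs: list[str]) -> float:
    # A simple lexical diversity proxy: unique unigrams / total unigrams.
    tokens = [t for o in outputs for t in o.split()]
    return len(set(tokens)) / max(len(tokens), 1)


if __name__ == "__main__":
    q = "What is a fair way to divide an inheritance?"
    single = single_language_sampling(q, "en", n=8)
    mixed = mixed_language_sampling(q, LANGS, n=8)
    print("single-language distinct-1:", distinct_1(single))
    print("mixed-language distinct-1:", distinct_1(mixed))
```

In both strategies the final outputs are English, so any diversity gain comes from the thinking language alone rather than from surface-level language variation.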