Output diversity is crucial for Large Language Models, as it underpins pluralism and creativity. In this work, we reveal that controlling the language a model uses while thinking, its language of thought, provides a novel and structural source of output diversity. Our preliminary study shows that different thinking languages occupy distinct regions of a model's thinking space. Based on this observation, we study two repeated-sampling strategies under multilingual thinking, Single-Language Sampling and Mixed-Language Sampling, and evaluate the diversity of outputs that are constrained to English regardless of the thinking language used. Across extensive experiments, we demonstrate that switching the thinking language from English to a non-English language consistently increases output diversity, with a clear positive correlation: languages farther from English in the thinking space yield larger gains. We further show that aggregating samples across multiple thinking languages yields additional improvements through compositional effects, and that scaling sampling with linguistic heterogeneity raises the model's diversity ceiling. Finally, we show that these findings translate into practical benefits in pluralistic alignment scenarios, broadening the coverage of cultural knowledge and value orientations in LLM outputs. Our code is publicly available at https://github.com/iNLP-Lab/Multilingual-LoT-Diversity.
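To make the two strategies concrete, here is a minimal sketch in Python. The `generate` function is a hypothetical stand-in for any LLM call that lets the thinking language be set (e.g., via a prompt instruction) while the final answer is constrained to English; the function names and the random draw in Mixed-Language Sampling are illustrative assumptions, not the paper's implementation.

```python
import random

# Hypothetical model call: returns an English answer produced after
# chain-of-thought reasoning conducted in `thinking_language`.
# A stand-in for any LLM API where the thinking language can be
# controlled via the prompt while the output is forced to English.
def generate(question: str, thinking_language: str) -> str:
    return f"[English answer to {question!r}, thought in {thinking_language}]"

def single_language_sampling(question: str, language: str, n: int) -> list[str]:
    """All n samples share a single thinking language."""
    return [generate(question, language) for _ in range(n)]

def mixed_language_sampling(question: str, languages: list[str], n: int) -> list[str]:
    """Each of the n samples draws its thinking language from a pool.
    (Random draws are an assumption; a uniform split would also fit.)"""
    return [generate(question, random.choice(languages)) for _ in range(n)]

if __name__ == "__main__":
    q = "What does a typical breakfast look like?"
    print(single_language_sampling(q, "Japanese", 4))
    print(mixed_language_sampling(q, ["English", "Hindi", "Arabic"], 4))
```

Under this reading, Single-Language Sampling spends the whole sampling budget in one region of the thinking space, while Mixed-Language Sampling spreads the same budget across several regions, which is where the compositional gains described above would arise.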