Inductive reasoning, the process of inferring general rules from a small number of observations, is a fundamental aspect of human intelligence. Recent work suggests that large language models (LLMs) can perform inductive reasoning by sampling multiple hypotheses about the underlying rule and selecting the one that best explains the observations. However, because the hypotheses are sampled IID, semantically redundant hypotheses are frequently generated, wasting significant compute. In this paper, we 1) demonstrate that raising the temperature to enhance diversity is limited by the text-degeneration issue, and 2) propose a novel method that improves diversity while maintaining text quality. We first analyze the effect of increasing the temperature parameter, commonly regarded as the LLM's diversity control, on IID hypotheses. Our analysis shows that as the temperature rises, the diversity and accuracy of the hypotheses increase up to a certain point, but this trend saturates due to text degeneration. To generate hypotheses that are more semantically diverse and of higher quality, we propose a novel approach inspired by human inductive reasoning, which we call Mixture of Concepts (MoC). Applied to several inductive reasoning benchmarks, MoC demonstrates significant performance improvements over standard IID sampling and other approaches.
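The sample-then-select loop described above can be illustrated with a minimal sketch. This is not the paper's implementation: `sample_hypothesis` is a mock stand-in for an LLM call, and the candidate rules, observations, and function names are hypothetical. It shows how IID draws tend to repeat the same hypothesis (semantic redundancy) before the best-scoring one is selected.

```python
import random

# Toy observations: pairs (x, f(x)) generated by a hidden rule.
OBSERVATIONS = [(1, 2), (2, 4), (3, 6), (4, 8)]

# Hypothetical candidate rules the mock "model" can propose.
HYPOTHESES = {
    "double": lambda x: 2 * x,
    "square": lambda x: x * x,
    "add_two": lambda x: x + 2,
}

def sample_hypothesis(rng):
    """Mock IID sampler: draws a rule uniformly, so duplicates are common.
    In the real setting this would be one temperature-controlled LLM call."""
    return rng.choice(list(HYPOTHESES))

def score(name):
    """Fraction of observations the rule explains."""
    rule = HYPOTHESES[name]
    return sum(rule(x) == y for x, y in OBSERVATIONS) / len(OBSERVATIONS)

def sample_and_select(n, seed=0):
    """Draw n IID hypotheses, deduplicate, and keep the best-scoring one."""
    rng = random.Random(seed)
    samples = [sample_hypothesis(rng) for _ in range(n)]
    unique = set(samples)  # semantic redundancy: len(unique) can be << n
    best = max(unique, key=score)
    return best, len(samples), len(unique)

best, n_drawn, n_unique = sample_and_select(16)
print(best, n_drawn, n_unique)
```

Because only three distinct rules exist here, 16 IID draws are mostly wasted on repeats; the paper's observation is that the same happens with LLM hypotheses, and raising the temperature only partially helps.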