A popular approach for improving the correctness of output from large language models (LLMs) is Self-Consistency: poll the LLM multiple times and output the most frequent solution. Existing Self-Consistency techniques always generate a constant number of samples per question, whereas a better approach would be to distribute the available budget non-uniformly, based on the amount of agreement among the samples generated so far. In response, we introduce Adaptive-Consistency, a cost-efficient, model-agnostic technique that dynamically adjusts the number of samples per question using a lightweight stopping criterion. Our experiments over 17 reasoning and code generation datasets and three LLMs demonstrate that Adaptive-Consistency reduces the sample budget by up to 7.9 times with an average accuracy drop of less than 0.1%. Our code and data are available at https://www.sample-step-by-step.info
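To make the idea of sampling until agreement is sufficient concrete, the following is a minimal sketch of a majority vote with early stopping. The abstract does not specify the stopping criterion, so this sketch substitutes a simple deterministic rule as an assumption: stop as soon as the leading answer can no longer be overtaken within the remaining budget. The names `adaptive_majority_vote`, `sample_answer`, and `max_samples` are hypothetical and are not taken from the released code.

```python
from collections import Counter
from typing import Callable, Hashable, Tuple


def adaptive_majority_vote(
    sample_answer: Callable[[], Hashable],
    max_samples: int = 40,
) -> Tuple[Hashable, int]:
    """Draw answers one at a time and stop early when the vote is decided.

    `sample_answer` is a hypothetical callable that queries the LLM once and
    returns its final answer. The stopping rule here is an illustrative
    assumption, not the criterion used by Adaptive-Consistency.
    Returns the chosen answer and the number of samples actually drawn.
    """
    if max_samples < 1:
        raise ValueError("max_samples must be at least 1")

    counts: Counter = Counter()
    leader = None
    for drawn in range(1, max_samples + 1):
        counts[sample_answer()] += 1

        # Compare the most frequent answer with the runner-up (0 if none yet).
        ranked = counts.most_common(2)
        leader, lead_count = ranked[0]
        runner_up_count = ranked[1][1] if len(ranked) > 1 else 0

        # If the lead exceeds the samples still available, no other answer
        # (seen or unseen) can catch up, so further sampling is wasted.
        remaining = max_samples - drawn
        if lead_count - runner_up_count > remaining:
            return leader, drawn

    # Budget exhausted with a tie at the top: return the current leader.
    return leader, max_samples


if __name__ == "__main__":
    # Stub sampler that always agrees with itself, standing in for an LLM.
    answer, used = adaptive_majority_vote(lambda: "42", max_samples=40)
    print(answer, used)  # prints: 42 21
```

Because this rule only stops when the final majority is already determined, it returns the same answer as a fixed-budget vote over the same samples, at the price of saving at most about half the budget. A probabilistic criterion can stop earlier by accepting a small risk of changing the answer, which matches the trade-off the abstract reports between sample budget and accuracy.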