A popular approach for improving the correctness of output from large language models (LLMs) is Self-Consistency: poll the LLM multiple times and output the most frequent solution. Existing Self-Consistency techniques always draw a constant number of samples per question, whereas a better approach would be to distribute the available budget non-uniformly, based on the amount of agreement among the samples drawn so far. In response, we introduce Adaptive-Consistency, a cost-efficient, model-agnostic technique that dynamically adjusts the number of samples per question using a lightweight stopping criterion. Our experiments over 13 datasets and two LLMs demonstrate that Adaptive-Consistency reduces the sample budget by up to 6.0 times with an average accuracy drop of less than 0.1%.
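The sampling loop described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not specify the stopping criterion, so the sketch assumes a simple Beta-posterior rule on the two most frequent answers, and the names `sample_fn`, `max_samples`, `threshold`, and `majority_confidence` are hypothetical.

```python
from collections import Counter
from math import comb


def majority_confidence(counts):
    """Assumed stopping statistic (not taken from the abstract).

    Returns P(p > 0.5) for p ~ Beta(v1 + 1, v2 + 1), where v1 and v2 are
    the counts of the two most frequent answers so far. For integer
    parameters this equals a binomial tail sum, so no SciPy is needed.
    """
    top = counts.most_common(2)
    v1 = top[0][1]
    v2 = top[1][1] if len(top) > 1 else 0
    n = v1 + v2 + 1
    return sum(comb(n, j) for j in range(v1 + 1)) / 2 ** n


def adaptive_consistency(sample_fn, max_samples=40, threshold=0.95):
    """Draw answers one at a time and stop once the majority looks stable.

    sample_fn is a hypothetical zero-argument callable that queries the
    LLM once and returns a hashable final answer.
    Returns (majority_answer, number_of_samples_drawn).
    """
    counts = Counter()
    drawn = 0
    for drawn in range(1, max_samples + 1):
        counts[sample_fn()] += 1
        # Stop early when the current leader is confidently the majority.
        if majority_confidence(counts) >= threshold:
            break
    answer = counts.most_common(1)[0][0]
    return answer, drawn
```

Under this assumed rule with `threshold=0.95`, a sampler that always returns the same answer stops after 4 draws, while a sampler whose answers keep disagreeing uses the full `max_samples` budget, which matches the non-uniform allocation the abstract argues for.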