We present a state-of-the-art model for fine-grained probability estimation of propositions conditioned on context. Recent advances in large language models (LLMs) have significantly enhanced their reasoning capabilities, particularly on well-defined tasks with complete information. However, LLMs continue to struggle with making accurate and well-calibrated probabilistic predictions under uncertainty or partial information. While incorporating uncertainty into model predictions often boosts performance, obtaining reliable estimates of that uncertainty remains understudied. In particular, LLM probability estimates tend to be coarse and biased towards more frequent numbers. Through a combination of human and synthetic data creation and assessment, scaling to larger models, and better supervision, we propose a set of strong and precise probability estimation models. We conduct systematic evaluations across tasks that rely on conditional probability estimation and show that our approach consistently outperforms existing fine-tuned and prompting-based methods by a large margin.