Machine learning practitioners often face significant challenges in formally integrating their prior knowledge and beliefs into predictive models, limiting the potential for nuanced and context-aware analyses. Moreover, the expertise needed to integrate this prior knowledge into probabilistic modeling typically limits the application of these models to specialists. Our goal is to build a regression model that can process numerical data and make probabilistic predictions at arbitrary locations, guided by natural language text that describes a user's prior knowledge. Large Language Models (LLMs) provide a useful starting point for designing such a tool since they 1) provide an interface where users can incorporate expert insights in natural language and 2) provide an opportunity for leveraging latent problem-relevant knowledge encoded in LLMs that users may not have themselves. We start by exploring strategies for eliciting explicit, coherent numerical predictive distributions from LLMs. We examine these joint predictive distributions, which we call LLM Processes, over arbitrarily many quantities in settings such as forecasting, multi-dimensional regression, black-box optimization, and image modeling. We investigate the practical details of prompting to elicit coherent predictive distributions, and demonstrate their effectiveness at regression. Finally, we demonstrate the ability to usefully incorporate text into numerical predictions, improving predictive performance and giving quantitative structure that reflects qualitative descriptions. This lets us begin to explore the rich, grounded hypothesis space that LLMs implicitly encode.
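The core elicitation idea described above can be sketched in a few lines: serialize the observed (x, y) pairs as text, optionally prefix the user's natural-language prior knowledge, end the prompt at a query location, and treat repeated numeric completions as samples from the LLM's predictive distribution there. This is a minimal illustrative sketch, not the paper's exact protocol; `stub_llm_sample` is a hypothetical placeholder standing in for a real LLM completion call, and the two-decimal number format and comma-separated layout are assumptions made so the example runs end to end.

```python
import random
import statistics


def build_prompt(xs, ys, x_query, context=""):
    """Serialize numeric (x, y) pairs as text, optionally prefixed with
    user-supplied natural-language prior knowledge, ending with the query
    location so the model completes the next y value."""
    lines = [context] if context else []
    lines += [f"{x:.2f}, {y:.2f}" for x, y in zip(xs, ys)]
    lines.append(f"{x_query:.2f},")  # left open for the completion
    return "\n".join(lines)


def stub_llm_sample(prompt, rng):
    """Placeholder for a real LLM completion call: returns a noisy linear
    extrapolation so this sketch is runnable without an actual model."""
    x_query = float(prompt.strip().splitlines()[-1].rstrip(","))
    return f"{2.0 * x_query + rng.gauss(0.0, 0.1):.2f}"


def elicit_predictive_distribution(xs, ys, x_query, n_samples=100,
                                   context="", seed=0):
    """Draw repeated numeric completions and treat them as samples from
    the model's predictive distribution at x_query."""
    rng = random.Random(seed)
    prompt = build_prompt(xs, ys, x_query, context)
    return [float(stub_llm_sample(prompt, rng)) for _ in range(n_samples)]


samples = elicit_predictive_distribution([0, 1, 2], [0.1, 2.0, 3.9], 3.0)
print(statistics.mean(samples), statistics.stdev(samples))
```

With a real model in place of the stub, the empirical mean and spread of the sampled completions give a point prediction and an uncertainty estimate at the query location, and the `context` string is where a user's prior knowledge in natural language would enter.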