Machine learning practitioners often face significant challenges in formally integrating their prior knowledge and beliefs into predictive models, limiting the potential for nuanced and context-aware analyses. Moreover, the expertise needed to integrate this prior knowledge into probabilistic modeling typically limits the application of these models to specialists. Our goal is to build a regression model that can process numerical data and make probabilistic predictions at arbitrary locations, guided by natural language text describing a user's prior knowledge. Large Language Models (LLMs) provide a useful starting point for designing such a tool since they 1) offer an interface through which users can express expert insights in natural language and 2) offer an opportunity to leverage latent problem-relevant knowledge encoded in LLMs that users may not have themselves. We start by exploring strategies for eliciting explicit, coherent numerical predictive distributions from LLMs. We examine these joint predictive distributions, which we call LLM Processes, over arbitrarily many quantities in settings such as forecasting, multi-dimensional regression, black-box optimization, and image modeling. We investigate the practical details of prompting to elicit coherent predictive distributions, and demonstrate their effectiveness at regression. Finally, we demonstrate the ability to usefully incorporate text into numerical predictions, improving predictive performance and giving quantitative structure that reflects qualitative descriptions. This lets us begin to explore the rich, grounded hypothesis space that LLMs implicitly encode.
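A minimal sketch of the sampling-based elicitation idea described above: observed (x, y) pairs and the user's textual prior are serialized into a prompt, repeated completions are sampled at a query location, and the parsed samples form an empirical predictive distribution. The prompt format, the number parsing, and the stubbed completions are illustrative assumptions, not the paper's exact protocol; a real implementation would replace the stub with sampled LLM outputs.

```python
import re
import statistics


def build_prompt(xs, ys, x_query, context=""):
    """Serialize observations as text lines, prefixed by the user's
    natural-language prior knowledge, ending with an open query.

    The trailing comma invites the model to complete the y value.
    (Hypothetical format for illustration.)
    """
    lines = [context] if context else []
    lines += [f"{x:.2f}, {y:.2f}" for x, y in zip(xs, ys)]
    lines.append(f"{x_query:.2f},")
    return "\n".join(lines)


def empirical_predictive(samples):
    """Parse sampled completions into numbers and summarize the
    resulting empirical predictive distribution."""
    values = []
    for s in samples:
        m = re.match(r"\s*(-?\d+(?:\.\d+)?)", s)
        if m:
            values.append(float(m.group(1)))
    return {"mean": statistics.fmean(values), "stdev": statistics.pstdev(values)}


# Stubbed completions stand in for repeated LLM samples at x = 2.0.
prompt = build_prompt([0.0, 1.0], [1.0, 3.0], 2.0,
                      context="y grows roughly linearly in x.")
samples = ["5.1", "4.9", "5.0"]
dist = empirical_predictive(samples)
```

Sampling many completions (rather than asking for a single point estimate) is what turns the LLM's output into a predictive distribution whose spread can be inspected and scored.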