Large language models (LLMs) have been extensively studied for their ability to generate convincing natural language sequences; however, their utility for quantitative information retrieval is less well understood. In this paper, we explore the feasibility of using LLMs as a mechanism for quantitative knowledge retrieval to aid data analysis tasks such as eliciting prior distributions for Bayesian models and imputing missing data. We present a prompt engineering framework that treats an LLM as an interface to a latent space of scientific literature, and we compare its responses in different contexts and domains against more established approaches. We discuss the implications and challenges of using LLMs as 'experts'.