Specifying the prior distribution for a Bayesian model is a central part of the Bayesian workflow for data analysis, but it is often difficult even for statistical experts. In principle, prior elicitation transforms domain knowledge of various kinds into well-defined prior distributions, and so offers a solution to the prior specification problem. In practice, however, we are still fairly far from having usable prior elicitation tools that could significantly influence the way we build probabilistic models in academia and industry. We lack elicitation methods that integrate well into the Bayesian workflow and that perform elicitation efficiently in terms of time and effort. We even lack a comprehensive theoretical framework for understanding the different facets of the prior elicitation problem. Why are we not widely using prior elicitation? We analyse the state of the art by identifying a range of key aspects of prior knowledge elicitation, from the properties of the modelling task and the nature of the priors to the form of interaction with the expert. We review and categorize the existing prior elicitation literature in these terms. This categorization reveals under-studied directions in prior elicitation research and leads to a proposal of several new avenues for improving prior elicitation methodology.