Prediction can be safely used as a proxy for explanation in causally consistent Bayesian generalized linear models

Bayesian modeling provides a principled approach to quantifying uncertainty in model parameters and model structure and has seen a surge of applications in recent years. Within a Bayesian workflow, we are concerned with model selection for the purpose of finding models that best explain the data, that is, models that help us understand the underlying data generating process. Since we rarely have access to the true process, all we are left with in real-world analyses is incomplete causal knowledge from sources outside the current data, together with model predictions of that data. This raises an important question: when is it valid to use prediction as a proxy for explanation in model selection? We approach this question by means of large-scale simulations of Bayesian generalized linear models in which we investigate various causal and statistical misspecifications. Our results indicate that using prediction as a proxy for explanation is valid and safe only when the models under consideration are sufficiently consistent with the underlying causal structure of the true data generating process.
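The abstract's central claim can be illustrated with a small toy simulation. The sketch below is not the authors' simulation design; it is a minimal example under stated assumptions: a linear Gaussian data generating process with a true effect of 1.0 of x on y, a hypothetical collider variable c caused by both x and y, and ordinary least squares used as a stand-in for a Bayesian linear model with flat priors. All names (simulate, fit_ols, mse) are invented for illustration. Model A (y on x) is causally consistent; model B (y on x and c) adjusts for the collider.

```python
import numpy as np


def simulate(n, rng, beta=1.0):
    """Toy data generating process (assumed for illustration): x -> y, and x, y -> c."""
    x = rng.normal(size=n)
    y = beta * x + rng.normal(size=n)   # true causal effect of x on y is beta
    c = x + y + rng.normal(size=n)      # c is a collider: caused by both x and y
    return x, y, c


def fit_ols(features, y):
    """Least-squares fit with an intercept; returns [intercept, coef_1, ...]."""
    X = np.column_stack([np.ones(len(y))] + features)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def mse(coef, features, y):
    """Mean squared prediction error on the given data."""
    X = np.column_stack([np.ones(len(y))] + features)
    return float(np.mean((y - X @ coef) ** 2))


rng = np.random.default_rng(0)
x_tr, y_tr, c_tr = simulate(5000, rng)   # training data
x_te, y_te, c_te = simulate(5000, rng)   # held-out data

coef_a = fit_ols([x_tr], y_tr)           # model A: causally consistent
coef_b = fit_ols([x_tr, c_tr], y_tr)     # model B: adjusts for the collider

mse_a = mse(coef_a, [x_te], y_te)
mse_b = mse(coef_b, [x_te, c_te], y_te)

print(f"model A: coefficient on x = {coef_a[1]:.2f}, held-out MSE = {mse_a:.2f}")
print(f"model B: coefficient on x = {coef_b[1]:.2f}, held-out MSE = {mse_b:.2f}")
```

In this toy setup, model B predicts held-out data better than model A, yet its coefficient on x is close to zero rather than close to the true effect of 1.0. Selecting by predictive performance alone would therefore favor the model that explains the process worse, which is the failure mode the abstract warns about when candidate models are not causally consistent.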

