Visualizations play a critical role in validating and improving statistical models. However, the design space of model check visualizations is not well understood, making it difficult for authors to explore and specify effective graphical model checks. VMC defines a model check visualization using four components: (1) samples of distributions of checkable quantities generated from the model, including predictive distributions for new data and distributions of model parameters; (2) transformations on observed data to facilitate comparison; (3) visual representations of distributions; and (4) layouts to facilitate comparing model samples and observed data. We contribute an implementation of VMC as an R package. We validate VMC by reproducing a set of canonical model check examples, and show how using VMC to generate model checks reduces the edit distance between visualizations relative to existing visualization toolkits. The findings of an interview study with three expert modelers who used VMC highlight challenges and opportunities for encouraging exploration of correct, effective model check visualizations.
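To make the four components concrete, the following is a minimal Python sketch of how a model check could be specified as a composition of them. It is not the VMC R package or its API: the names `ModelCheck`, `histogram`, `render`, and `predictive_sampler` are hypothetical, the "model" is a plug-in normal fit rather than a full Bayesian posterior, and the "visual representation" is reduced to histogram bin counts so that the example runs without a plotting library.

```python
import random
import statistics
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence


@dataclass
class ModelCheck:
    """Hypothetical container mirroring the four components (not the VMC API)."""
    sampler: Callable[[int], List[List[float]]]             # (1) draws of a checkable quantity
    transform: Callable[[Sequence[float]], List[float]]     # (2) transformation on observed data
    representation: Callable[[Sequence[float]], List[int]]  # (3) visual representation of a distribution
    layout: str                                             # (4) "superposition" or "juxtaposition"


def histogram(values: Sequence[float], lo: float = -4.0, hi: float = 4.0,
              bins: int = 8) -> List[int]:
    """Stand-in for a visual representation: counts per bin, clamping outliers."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        idx = int((v - lo) / width)
        idx = min(max(idx, 0), bins - 1)  # out-of-range values go to the edge bins
        counts[idx] += 1
    return counts


def render(check: ModelCheck, observed: Sequence[float],
           n_draws: int = 5) -> List[Dict]:
    """Apply the components in order and arrange the results into panels."""
    obs_marks = check.representation(check.transform(observed))
    draw_marks = [check.representation(d) for d in check.sampler(n_draws)]
    if check.layout == "superposition":
        # Observed data and model draws share a single panel.
        return [{"panel": "all", "observed": obs_marks, "model": draw_marks}]
    if check.layout == "juxtaposition":
        # Observed data and model draws sit in side-by-side panels.
        return [{"panel": "observed", "marks": [obs_marks]},
                {"panel": "model", "marks": draw_marks}]
    raise ValueError(f"unknown layout: {check.layout}")


rng = random.Random(0)
observed = [rng.gauss(0.3, 1.2) for _ in range(50)]  # synthetic data for illustration

# Assumption: a plug-in normal model fitted by moments stands in for a real fitted model.
mu = statistics.fmean(observed)
sigma = statistics.stdev(observed)


def predictive_sampler(n: int) -> List[List[float]]:
    """Draw n replicated datasets, each the same size as the observed data."""
    return [[rng.gauss(mu, sigma) for _ in observed] for _ in range(n)]


check = ModelCheck(sampler=predictive_sampler,
                   transform=lambda y: list(y),  # identity transformation
                   representation=histogram,
                   layout="juxtaposition")
panels = render(check, observed)
```

Under this sketch, switching between comparison layouts means changing only the `layout` field, and swapping `histogram` for another summary changes only the representation. That is the kind of small edit between related checks that a component-based specification is meant to allow.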