This paper introduces a novel Approximate Bayesian Computation (ABC) framework for estimating the posterior distribution and the maximum likelihood estimate (MLE) of the parameters of models defined by intractable likelihood functions. The framework describes the possibly skewed and high-dimensional posterior distribution by a novel multivariate copula-based distribution with two components: univariate marginal posterior distributions, which can account for skewness and can be accurately estimated by Distribution Random Forests (DRF) while performing automatic selection of summary statistics (covariates); and robustly estimated copula dependence parameters. The framework employs a novel multivariate mode estimator for MLE and posterior mode estimation, and provides an optional step for model selection from a given set of models, with posterior model probabilities estimated by DRF. The accuracy of the framework's posterior distribution estimates is illustrated through simulation studies involving models with analytically computable posterior distributions, as well as exponential random graph and mechanistic network models, each defined by an intractable likelihood from which it is costly to simulate large network datasets. The framework is also illustrated through analyses of large real-life networks ranging in size from 28,000 to 65.6 million nodes (from 3 million to 1.8 billion edges), including a large multilayer network with weighted directed edges.
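The copula construction summarized in the abstract (separately estimated marginals joined by dependence parameters) can be illustrated with a minimal sketch. It is not the paper's method: a bivariate Gaussian copula stands in for the paper's copula-based distribution, plain empirical marginals stand in for the DRF-estimated marginal posteriors, and there is no summary statistics selection. All function names (`normal_scores`, `pearson`, `empirical_quantile`, `sample_copula`) and the toy data are assumptions made for this illustration.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used for scores and CDF values


def normal_scores(values):
    """Map a sample to standard-normal scores through its empirical ranks."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    scores = [0.0] * n
    for rank, i in enumerate(order, start=1):
        scores[i] = STD_NORMAL.inv_cdf(rank / (n + 1))
    return scores


def pearson(a, b):
    """Pearson correlation of two equal-length samples."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / math.sqrt(var_a * var_b)


def empirical_quantile(sorted_values, u):
    """Inverse empirical CDF: the order statistic at probability u."""
    idx = min(int(u * len(sorted_values)), len(sorted_values) - 1)
    return sorted_values[idx]


def sample_copula(x, y, rho, n_draws, rng):
    """Draw pairs whose marginals follow x and y, with Gaussian-copula dependence rho."""
    xs, ys = sorted(x), sorted(y)
    scale = math.sqrt(max(0.0, 1.0 - rho * rho))
    draws = []
    for _ in range(n_draws):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + scale * rng.gauss(0.0, 1.0)
        draws.append((empirical_quantile(xs, STD_NORMAL.cdf(z1)),
                      empirical_quantile(ys, STD_NORMAL.cdf(z2))))
    return draws


# Toy "posterior sample" (assumed data): two skewed, dependent parameters
# obtained by monotone transforms of normals correlated at 0.7.
rng = random.Random(0)
theta1, theta2 = [], []
for _ in range(2000):
    g1 = rng.gauss(0.0, 1.0)
    g2 = 0.7 * g1 + math.sqrt(1.0 - 0.7 ** 2) * rng.gauss(0.0, 1.0)
    theta1.append(math.exp(g1))  # right-skewed marginal
    theta2.append(g2 ** 3)       # heavy-tailed marginal

# Dependence parameter estimated on the normal-score scale, so the
# skewness of the marginals does not distort it.
rho_hat = pearson(normal_scores(theta1), normal_scores(theta2))
draws = sample_copula(theta1, theta2, rho_hat, 500, rng)
```

Because the dependence is estimated from ranks rather than raw values, the marginals can be arbitrarily skewed without affecting the copula parameter, which is the property that makes this kind of decomposition attractive for skewed posteriors.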