Deep learning models are increasingly used to analyze parametric statistical models within simulation-only frameworks. Bayesian models using normalizing flows simulate data from a prior distribution and are composed of two deep neural networks: a summary network that learns a sufficient statistic for the parameter, and a normalizing flow that, conditional on the summary network's output, approximates the posterior distribution. Here, we explore frequentist models that are based on a single summary network. During training, the input to the network is a data set simulated under a given parameter, and the loss function minimizes the mean-squared error between the learned summary and that parameter. The network thereby solves the inverse problem of parameter estimation. We propose a branched network structure containing collapsing layers that reduce a data set to summary statistics, which are then mapped through fully connected layers to approximate the parameter estimate. We motivate our choice of network structure by theoretical considerations. In simulations we demonstrate three desirable properties of the parameter estimates: finite-sample exactness, robustness to data contamination, and algorithm approximation. These properties are achieved by offering the network varying sample sizes, contaminated data, and data requiring algorithmic reconstruction during the training phase. In our simulations, the network automatically approximates an EM algorithm for genetic data. Simulation-only approaches offer practical advantages in complex modeling tasks: the simpler task of data simulation is left to the researcher, while the harder inverse problem is left to the neural network. Challenging future work includes offering pre-trained models that can be used in a wide variety of applications.
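The branched structure described above can be illustrated with a minimal sketch: a collapsing layer pools a data set of arbitrary sample size down to fixed-length summary statistics, which fully connected layers then map to a parameter estimate. All names, layer sizes, and the choice of mean/standard-deviation pooling are illustrative assumptions, not the paper's actual architecture; the sketch only demonstrates why pooling over the sample axis yields permutation invariance and accepts varying sample sizes.

```python
import numpy as np

rng = np.random.default_rng(0)

def collapsing_layer(x):
    # Reduce a data set of shape (n, d) to fixed-length summary
    # statistics by pooling over the sample axis. Pooling makes the
    # network permutation invariant and independent of the sample size n.
    return np.concatenate([x.mean(axis=0), x.std(axis=0)])

def mlp(s, W1, b1, W2, b2):
    # Fully connected layers mapping the pooled summaries
    # to an approximate parameter estimate.
    h = np.tanh(s @ W1 + b1)
    return h @ W2 + b2

d = 2                                       # dimension of one observation
W1 = rng.normal(size=(2 * d, 16), scale=0.5)
b1 = np.zeros(16)
W2 = rng.normal(size=(16, 1), scale=0.5)
b2 = np.zeros(1)

# A data set simulated under some parameter (here: a location of 1.5).
x = rng.normal(loc=1.5, size=(50, d))
theta_hat = mlp(collapsing_layer(x), W1, b1, W2, b2)

# Permutation invariance: shuffling the observations leaves the
# estimate unchanged, and a different sample size still works.
x_perm = x[rng.permutation(len(x))]
assert np.allclose(theta_hat, mlp(collapsing_layer(x_perm), W1, b1, W2, b2))
x_small = rng.normal(loc=1.5, size=(10, d))
print(mlp(collapsing_layer(x_small), W1, b1, W2, b2).shape)
```

During training, such a network would be fed many (parameter, simulated data set) pairs and fit by minimizing the mean-squared error between its output and the generating parameter; the weights here are random, so the sketch shows only the forward pass and the invariance property.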