Mixing (or prior) density estimation is an important problem in machine learning and statistics, especially in empirical Bayes $g$-modeling where accurately estimating the prior is necessary for making good posterior inferences. In this paper, we propose neural-$g$, a new neural network-based estimator for $g$-modeling. Neural-$g$ uses a softmax output layer to ensure that the estimated prior is a valid probability density. Under default hyperparameters, we show that neural-$g$ is very flexible and capable of capturing many unknown densities, including those with flat regions, heavy tails, and/or discontinuities. In contrast, existing methods struggle to capture all of these prior shapes. We provide justification for neural-$g$ by establishing a new universal approximation theorem regarding the capability of neural networks to learn arbitrary probability mass functions. To accelerate convergence of our numerical implementation, we utilize a weighted average gradient descent approach to update the network parameters. Finally, we extend neural-$g$ to multivariate prior density estimation. We illustrate the efficacy of our approach through simulations and analyses of real datasets. A software package to implement neural-$g$ is publicly available at https://github.com/shijiew97/neuralG.
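The core constraint described above — a softmax output layer guaranteeing that the estimated prior is a valid probability mass function over a grid of support points — can be illustrated with a minimal sketch. This is an assumed toy architecture for illustration only, not the authors' actual implementation; the network shape, grid, and weights are all hypothetical.

```python
import numpy as np

# Hypothetical one-hidden-layer network: a constant dummy input is mapped
# to softmax weights over a fixed grid of support points, so the estimated
# prior is nonnegative and sums to one by construction.
rng = np.random.default_rng(0)

grid = np.linspace(-3.0, 3.0, 100)               # assumed support points
W1 = rng.normal(size=(1, 32)); b1 = rng.normal(size=32)
W2 = rng.normal(size=(32, grid.size)); b2 = rng.normal(size=grid.size)

def neural_prior():
    """Forward pass: constant input -> tanh hidden layer -> softmax over grid."""
    h = np.tanh(np.array([[1.0]]) @ W1 + b1)     # hidden features
    logits = h @ W2 + b2
    z = np.exp(logits - logits.max())            # numerically stable softmax
    return (z / z.sum()).ravel()                 # weights sum to 1

p = neural_prior()
print(round(p.sum(), 8))  # softmax output is a valid pmf
```

Training would then adjust the weights to maximize the marginal likelihood of the observed data, but the softmax layer ensures validity of the prior at every step regardless of the weight values.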