Bayesian hierarchical models are commonly used for inference on count data, as they account for multiple levels of variation by placing prior distributions on parameters at different levels. Examples include the Beta-Binomial, Negative-Binomial (NB), and Dirichlet-Multinomial (DM) distributions. In this paper, we address two challenges that arise in many Bayesian count models: inference for the concentration parameter in the ratio of Gamma functions, and the inability of these models to handle excess zeros and small nonzero counts effectively. We propose a novel class of prior distributions that facilitates conjugate updating of the concentration parameter in Gamma ratios, enabling full Bayesian inference for the aforementioned count distributions. We use DM models as our running example. Our methodology leverages fast residue computation and admits closed-form posterior moments. Additionally, we recommend a default horseshoe-type prior, which has a heavy tail and substantial mass around zero. It provides continuous shrinkage, making the posterior highly adaptable to sparsity or quasi-sparsity in the data. Furthermore, we offer insights into, and potential generalizations to, other count models facing the same two challenges. We demonstrate the usefulness of our approach on both simulated examples and real-world applications. Finally, we conclude with directions for future research.
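To make the phrase "ratio of Gamma functions" concrete, the sketch below evaluates the standard DM log-probability mass function, in which the parameters appear only through terms such as Γ(α₀)/Γ(n + α₀) and Γ(x_k + α_k)/Γ(α_k). It illustrates the textbook likelihood only; it does not implement the proposed prior, the residue computation, or the horseshoe-type prior described above. The function name `dm_log_pmf` and its arguments are hypothetical and chosen for illustration.

```python
import math


def dm_log_pmf(counts, alpha):
    """Log-pmf of the Dirichlet-Multinomial for a single count vector.

    counts: non-negative integer counts x_1..x_K (illustrative input)
    alpha:  positive Dirichlet parameters alpha_1..alpha_K
    """
    if len(counts) != len(alpha):
        raise ValueError("counts and alpha must have the same length")
    n = sum(counts)
    alpha0 = sum(alpha)  # total concentration

    # Multinomial coefficient: log n! - sum_k log x_k!
    log_coef = math.lgamma(n + 1) - sum(math.lgamma(x + 1) for x in counts)

    # Gamma ratio in the total concentration: Gamma(alpha0) / Gamma(n + alpha0)
    log_ratio_total = math.lgamma(alpha0) - math.lgamma(n + alpha0)

    # Per-category Gamma ratios: Gamma(x_k + alpha_k) / Gamma(alpha_k)
    log_ratio_parts = sum(
        math.lgamma(x + a) - math.lgamma(a) for x, a in zip(counts, alpha)
    )
    return log_coef + log_ratio_total + log_ratio_parts


# With K = 2 and alpha = (1, 1), the DM reduces to a Beta-Binomial with a
# uniform prior, so each of the n + 1 possible outcomes is equally likely.
p = math.exp(dm_log_pmf([1, 2], [1.0, 1.0]))
```

Working on the log scale with `math.lgamma` avoids overflow in the Gamma functions for large counts. Zero counts contribute nothing to the per-category sum, since Γ(0 + α_k)/Γ(α_k) = 1.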