Pochhammer Priors for Sparse Count Models

Bayesian hierarchical models are widely used for inference on count data because they capture multiple levels of variation by placing prior distributions on parameters at each level. Examples include the Beta-Binomial, Negative-Binomial (NB), and Dirichlet-Multinomial (DM) distributions. In this paper, we address two challenges that arise across Bayesian count models: inference for the concentration parameter, which enters the likelihood through a ratio of Gamma functions, and the inability of these models to handle excess zeros and small nonzero counts. We propose a novel class of prior distributions that permits conjugate updating of the concentration parameter in Gamma ratios, enabling full Bayesian inference for these count distributions. We use DM models as our running example. Our methodology leverages fast residue computation and admits closed-form posterior moments. We also recommend a default horseshoe-type prior, which has a heavy tail and substantial mass near zero. It yields continuous shrinkage, making the posterior highly adaptable to sparsity or quasi-sparsity in the data. We further offer insights and potential generalizations to other count models facing the same two challenges. We demonstrate the usefulness of our approach on simulated examples and real-world applications, and conclude with directions for future research.
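To make the "ratio of Gamma functions" concrete, the following is a minimal sketch of the standard DM log-pmf, written so that the concentration parameters appear only through Pochhammer symbols (rising factorials) (a)_n = Gamma(a + n) / Gamma(a). It illustrates the textbook likelihood only and is not the prior or the residue-based computation proposed in the paper; the function names `log_rising_factorial` and `dm_log_pmf` are hypothetical and chosen for this example.

```python
import math


def log_rising_factorial(a, n):
    """Log of the Pochhammer symbol (a)_n = Gamma(a + n) / Gamma(a)."""
    return math.lgamma(a + n) - math.lgamma(a)


def dm_log_pmf(counts, alpha):
    """Log-pmf of a Dirichlet-Multinomial with concentration vector alpha.

    pmf = n! / prod(x_k!) * prod_k (alpha_k)_{x_k} / (alpha_0)_n,
    where n = sum(x_k) and alpha_0 = sum(alpha_k).
    """
    n = sum(counts)
    alpha0 = sum(alpha)
    # Log multinomial coefficient n! / prod(x_k!); it does not involve alpha.
    log_coef = math.lgamma(n + 1) - sum(math.lgamma(x + 1) for x in counts)
    # The Gamma ratios that carry all dependence on the concentration parameters.
    log_ratio = sum(log_rising_factorial(a, x) for a, x in zip(alpha, counts))
    return log_coef + log_ratio - log_rising_factorial(alpha0, n)


if __name__ == "__main__":
    # With two categories and alpha = (1, 1) the DM reduces to a Beta-Binomial
    # with a uniform prior, so each outcome of n = 3 trials has probability 1/4.
    print(math.exp(dm_log_pmf([1, 2], [1.0, 1.0])))
```

Because every factor involving alpha is a rising factorial, the likelihood is a ratio of polynomials in the concentration parameters, and a zero count contributes (alpha_k)_0 = 1, so it adds no factor for that category.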

