Bayesian Finite Mixture Models

Finite mixture models are a useful class of statistical models for clustering and density approximation. In the Bayesian framework, finite mixture models require the specification of suitable priors in addition to the data model. These priors help avoid spurious results and provide a principled way to define cluster shapes and to express a preference for specific cluster solutions. A generic estimation scheme for finite mixtures with a fixed number of components is available using Markov chain Monte Carlo (MCMC) sampling with data augmentation. The posterior makes it possible to assess uncertainty comprehensively, but component-specific posterior inference requires resolving the label switching issue. In this paper we focus on the application of Bayesian finite mixture models to clustering. We start by discussing suitable specification, estimation and inference of the model when the number of components is assumed to be known. We then explain suitable strategies for fitting Bayesian finite mixture models when the number of components is not known. In addition, all steps required to perform Bayesian finite mixture modeling are illustrated on a data example in which a finite mixture of multivariate Gaussian distributions is fitted. Suitable prior specification, estimation using MCMC and posterior inference are discussed for this example, both with the number of components assumed known and with it treated as unknown.
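As a rough illustration of MCMC sampling with data augmentation for a fixed number of components, the sketch below runs a Gibbs sampler for a univariate Gaussian mixture. It is not the scheme from the paper, which treats multivariate Gaussians. The function name `gibbs_mixture`, the known shared variance `sigma2`, the symmetric Dirichlet prior parameter `e0`, and the normal prior on the means with mean `b0` and variance `B0` are all simplifying assumptions made here for brevity.

```python
import math
import random


def gibbs_mixture(y, K, n_iter=500, e0=1.0, b0=0.0, B0=100.0, sigma2=1.0, seed=0):
    """Gibbs sampler for a K-component univariate Gaussian mixture.

    Assumptions (illustrative only): known shared variance sigma2,
    Dirichlet(e0, ..., e0) prior on the weights, N(b0, B0) prior on each mean.
    Returns a list of (weights, means) draws, one per iteration.
    """
    rng = random.Random(seed)
    mu = rng.sample(list(y), K)   # initialise means at random data points
    eta = [1.0 / K] * K           # initialise weights as equal
    draws = []
    for _ in range(n_iter):
        # Step 1 (data augmentation): sample a latent allocation per observation.
        counts = [0] * K
        sums = [0.0] * K
        for yi in y:
            logw = [math.log(eta[k]) - 0.5 * (yi - mu[k]) ** 2 / sigma2
                    for k in range(K)]
            m = max(logw)  # subtract the maximum to avoid numerical underflow
            w = [math.exp(lw - m) for lw in logw]
            k = rng.choices(range(K), weights=w)[0]
            counts[k] += 1
            sums[k] += yi
        # Step 2: weights given allocations follow Dirichlet(e0 + counts),
        # sampled here as normalised gamma draws.
        g = [rng.gammavariate(e0 + counts[k], 1.0) for k in range(K)]
        total = sum(g)
        eta = [gk / total for gk in g]
        # Step 3: each mean given its allocated observations is conjugate normal.
        for k in range(K):
            prec = 1.0 / B0 + counts[k] / sigma2
            mean = (b0 / B0 + sums[k] / sigma2) / prec
            mu[k] = rng.gauss(mean, math.sqrt(1.0 / prec))
        draws.append((list(eta), list(mu)))
    return draws
```

Because the component labels are arbitrary, the raw draws of the means can swap between components across iterations. Sorting the means within each draw is one simple identifiability constraint for this univariate case; it is used here only as a convenient stand-in for the more general treatment of label switching that the abstract refers to.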

