Structured Mixture of Continuation-ratio Logits Models for Ordinal Regression

We develop a nonparametric Bayesian modeling approach to ordinal regression based on priors placed directly on the discrete distribution of the ordinal responses. The prior probability models are built from a structured mixture of multinomial distributions. We leverage a continuation-ratio logits representation to formulate the mixture kernel, with mixture weights defined through the logit stick-breaking process that incorporates the covariates through a linear function. The implied regression functions for the response probabilities can be expressed as weighted sums of parametric regression functions, with covariate-dependent weights. Thus, the modeling approach achieves flexible ordinal regression relationships, avoiding linearity or additivity assumptions in the covariate effects. Model flexibility is formally explored through the Kullback-Leibler support of the prior probability model. A key model feature is that the parameters for both the mixture kernel and the mixture weights can be associated with a continuation-ratio logits regression structure. Hence, an efficient and relatively easy-to-implement posterior simulation method can be designed, using Pólya-Gamma data augmentation. Moreover, the model is built from a conditional independence structure for category-specific parameters, which results in additional computational efficiency gains through partial parallel sampling. In addition to the general mixture structure, we study simplified model versions that incorporate covariate dependence only in the mixture kernel parameters or only in the mixture weights. For all proposed models, we discuss approaches to prior specification and develop Markov chain Monte Carlo methods for posterior simulation. The methodology is illustrated with several synthetic and real data examples.
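As a rough illustration of the structure described in the abstract, the sketch below evaluates response probabilities as a covariate-weighted sum of continuation-ratio logits kernels. It is not the authors' implementation: the function names, the truncation of the mixture to a finite number of components `L`, the sign convention of the logits, and the randomly drawn coefficients are all assumptions made for the example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def stick_breaking(v):
    """Map m break proportions in (0, 1) to m + 1 probabilities summing to one.

    p_j = v_j * prod_{k<j} (1 - v_k); the last entry takes the leftover mass.
    """
    v = np.asarray(v, dtype=float)
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - v)))
    return np.concatenate((v, [1.0])) * remaining

def cr_kernel_probs(x, beta):
    """Category probabilities of one continuation-ratio logits kernel.

    Assumed convention: P(Y = j | Y >= j, x) = sigmoid(beta[j] @ x),
    with beta of shape (C - 1, p) for C ordinal categories.
    """
    return stick_breaking(sigmoid(beta @ x))

def mixture_probs(x, betas, gammas):
    """Weighted sum of kernel probabilities with covariate-dependent weights.

    betas:  (L, C - 1, p) kernel coefficients, one set per component.
    gammas: (L - 1, p) coefficients of the logit stick-breaking weights
            (finite truncation at L components is an assumption here).
    """
    weights = stick_breaking(sigmoid(gammas @ x))             # shape (L,)
    kernels = np.stack([cr_kernel_probs(x, b) for b in betas])  # shape (L, C)
    return weights @ kernels                                   # shape (C,)

# Hypothetical sizes and coefficients, for illustration only.
rng = np.random.default_rng(0)
p, C, L = 3, 4, 5
x = np.array([1.0, 0.5, -1.2])
betas = rng.normal(size=(L, C - 1, p))
gammas = rng.normal(size=(L - 1, p))
probs = mixture_probs(x, betas, gammas)
```

The same stick-breaking map is reused for the kernel and for the weights, which mirrors the abstract's remark that both sets of parameters share a continuation-ratio logits regression structure.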

