Consistent Group selection using Global-local prior in High dimensional setup

We consider the problem of model selection when the regressors have an inherent grouping structure. Using a Bayesian approach, we model the mean vector with a one-group global-local shrinkage prior belonging to a broad class of such priors that includes the horseshoe prior. In the context of variable selection, this class of priors was studied by Tang et al. (2018) \cite{tang2018bayesian}. We propose a modified form of the usual class of global-local shrinkage priors, with polynomial tails on the group regression coefficients. The resulting threshold rule selects a group as active if the ratio of the $L_2$ norm of the posterior mean of its group coefficient to the $L_2$ norm of the corresponding ordinary least squares group estimate is greater than one half. In the theoretical part of this article, we treat the global shrinkage parameter $\tau$ either as a tuning parameter or through an empirical Bayes estimate, depending on what is known about the underlying sparsity of the model. When the proportion of active groups is known, we use $\tau$ as a tuning parameter and prove that our method enjoys variable selection consistency. When this proportion is unknown, we propose an empirical Bayes estimate of $\tau$; even with this estimate, our half-thresholding rule captures the true sparse group structure. Although our theoretical results rely on a special form of the design matrix, our simulation results show that, for general design matrices as well, the half-thresholding rule yields results similar to those of Yang and Narisetty (2020) \cite{yang2020consistent}. Consequently, in a high-dimensional sparse group selection problem, instead of using the so-called `gold standard' spike and slab prior, one can use one-group global-local shrinkage priors with polynomial tails to obtain similar results.
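The half-thresholding rule described above can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions, not the authors' implementation: the function name `half_threshold_select`, its arguments, and the toy numbers are hypothetical, and the posterior mean is assumed to be supplied by whatever sampler or approximation is used to fit the global-local prior.

```python
import numpy as np


def half_threshold_select(post_mean, ols_est, groups, threshold=0.5):
    """Return labels of groups whose L2-norm ratio exceeds the threshold.

    post_mean : posterior mean of all coefficients (assumed given).
    ols_est   : ordinary least squares estimate of the same coefficients.
    groups    : group label of each coefficient.
    """
    post_mean = np.asarray(post_mean, dtype=float)
    ols_est = np.asarray(ols_est, dtype=float)
    groups = np.asarray(groups)

    selected = []
    for g in np.unique(groups):
        idx = groups == g
        ols_norm = np.linalg.norm(ols_est[idx])
        if ols_norm == 0.0:
            # Ratio undefined; this sketch treats the group as inactive
            # (an assumption, not something stated in the abstract).
            continue
        ratio = np.linalg.norm(post_mean[idx]) / ols_norm
        if ratio > threshold:
            selected.append(g.item())
    return selected


# Toy example with made-up numbers: group 0 is barely shrunk,
# group 1 is shrunk almost to zero.
ols = [2.0, 2.0, 0.1, -0.1]
post = [1.8, 1.9, 0.001, -0.002]
labels = [0, 0, 1, 1]
selected = half_threshold_select(post, ols, labels)
```

In this toy input the ratio for group 0 is well above one half and the ratio for group 1 is close to zero, so only group 0 is kept.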

