We introduce a Bayesian framework for mixed-type multivariate regression using continuous shrinkage priors. Our framework enables joint analysis of mixed continuous and discrete outcomes and facilitates variable selection from the $p$ covariates. Theoretical studies of Bayesian mixed-type multivariate response models have not been conducted previously, and they require more intricate arguments than the corresponding theory for univariate response models because of the correlations between the responses. In this paper, we investigate necessary and sufficient conditions for posterior contraction of our method when $p$ grows faster than the sample size $n$. The existing literature on Bayesian high-dimensional asymptotics has focused only on cases where $p$ grows subexponentially with $n$. In contrast, we study the asymptotic regime where $p$ is allowed to grow exponentially with $n$. We develop a novel two-step approach for variable selection that possesses the sure screening property and provably achieves posterior contraction even under exponential growth of $p$. We demonstrate the utility of our method through simulation studies and applications to real data, including a cancer genomics dataset where $n=174$ and $p=9183$. The R code to implement our method is available at https://github.com/raybai07/MtMBSP.