Generalized additive models for location, scale and shape (GAMLSS) are a popular extension of mean regression models in which each parameter of an arbitrary distribution is modelled through covariates. While such models have been developed for univariate and bivariate responses, the truly multivariate case remains extremely challenging for both computational and theoretical reasons. Alternative approaches to GAMLSS may allow higher-dimensional response vectors to be modelled jointly, but they often assume a fixed dependence structure that does not depend on covariates, or they are limited in modelling flexibility or computational feasibility. We address this gap in the literature by proposing a truly multivariate distributional model, which allows one to benefit from the flexibility of GAMLSS even when the response has dimension larger than two or three. Building on copula regression, we model the dependence structure of the response through a Gaussian copula, while the marginal distributions can vary across components. Our model is highly parameterized, but estimation becomes feasible with Bayesian inference employing shrinkage priors. We demonstrate the competitiveness of our approach in a simulation study and illustrate how it complements existing models using the examples of childhood malnutrition and a previously unexplored data set on traffic detection in Berlin.
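To make the construction concrete, the following is a minimal sketch of how a three-dimensional response could be simulated from a Gaussian copula whose dependence structure and marginal parameters both vary with a single covariate x. It is an illustration only and not the estimation procedure of the paper: the function names, the coefficients, the choice of marginals (normal, exponential, logistic), and the parameterization of the correlation matrix through a covariate-dependent lower-triangular factor are all assumptions made for this example.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def scaled_factor(x):
    """Hypothetical covariate-dependent lower-triangular factor.

    The off-diagonal entries are linear in x (illustrative coefficients).
    Each row is rescaled to unit length, so factor @ factor.T is a valid
    correlation matrix for every value of x.
    """
    raw = [
        [1.0, 0.0, 0.0],
        [0.5 + 0.8 * x, 1.0, 0.0],
        [-0.3 * x, 0.4, 1.0],
    ]
    return [[v / math.sqrt(sum(w * w for w in row)) for v in row] for row in raw]


def implied_correlation(x):
    """Correlation matrix of the latent Gaussian vector at covariate value x."""
    f = scaled_factor(x)
    return [[sum(a * b for a, b in zip(ri, rj)) for rj in f] for ri in f]


def sample_response(x, rng):
    """Draw one response vector (y1, y2, y3) at covariate value x."""
    f = scaled_factor(x)
    eps = [rng.gauss(0.0, 1.0) for _ in range(3)]
    # Correlated standard normal variables: z = factor * eps.
    z = [sum(v * e for v, e in zip(row, eps)) for row in f]
    # Probability integral transform; clamp to keep the quantile functions finite.
    u = [min(max(STD_NORMAL.cdf(zi), 1e-12), 1.0 - 1e-12) for zi in z]
    # Assumed marginals, each with covariate-dependent parameters:
    # normal with mean 1 + 2x and standard deviation exp(0.3x),
    y1 = NormalDist(mu=1.0 + 2.0 * x, sigma=math.exp(0.3 * x)).inv_cdf(u[0])
    # exponential with rate exp(0.5x),
    y2 = -math.log(1.0 - u[1]) / math.exp(0.5 * x)
    # and standard logistic.
    y3 = math.log(u[2] / (1.0 - u[2]))
    return [y1, y2, y3]


rng = random.Random(0)
draws = [sample_response(0.5, rng) for _ in range(5)]
```

Calling `implied_correlation(x)` for different values of x shows how the dependence between components changes with the covariate, which is the feature that a fixed-dependence model would miss.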