While statistical modeling of distributional data has gained increasing attention, the case of multivariate distributions has been somewhat neglected despite its relevance in various applications. This is because the Wasserstein distance, which is commonly used in distributional data analysis, poses challenges for multivariate distributions. A promising alternative is the sliced Wasserstein distance, which offers a computationally simpler solution. We propose distributional regression models with multivariate distributions as responses paired with Euclidean vector predictors. These models work with the sliced Wasserstein distance, which is based on a slicing transform from the multivariate distribution space to the sliced distribution space. We introduce two regression approaches: one applies the sliced Wasserstein distance directly in the multivariate distribution space, and the other employs a univariate distribution regression for each slice. We develop both global and local Fréchet regression methods for these approaches and establish asymptotic convergence for sample-based estimators. The proposed regression methods are illustrated in simulations and in two data applications: joint distributions of systolic and diastolic blood pressure as a function of age, and joint distributions of excess winter death rates and winter temperature anomalies in European countries as a function of a country's base winter temperature.
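To make the idea of slicing concrete, the following is a minimal sketch of a Monte Carlo approximation to the sliced 2-Wasserstein distance between two multivariate samples. It is not taken from the paper. The function name `sliced_wasserstein`, the parameters `n_slices` and `n_quantiles`, the use of random rather than fixed directions, and the normalization (a plain average over directions) are all assumptions made for illustration. Each sample is projected onto unit directions, and the resulting one-dimensional distributions are compared through their quantile functions, which is what makes the univariate case computationally simple.

```python
import numpy as np

def sliced_wasserstein(x, y, n_slices=200, n_quantiles=100, seed=0):
    """Approximate the sliced 2-Wasserstein distance between samples
    x (n, d) and y (m, d). Illustrative sketch; names are hypothetical."""
    rng = np.random.default_rng(seed)
    d = x.shape[1]

    # Draw directions uniformly on the unit sphere by normalizing Gaussians.
    theta = rng.normal(size=(n_slices, d))
    theta /= np.linalg.norm(theta, axis=1, keepdims=True)

    # Slicing: project each sample onto every direction.
    px = x @ theta.T  # shape (n, n_slices)
    py = y @ theta.T  # shape (m, n_slices)

    # Empirical quantile functions of each slice on a common grid in (0, 1).
    u = (np.arange(n_quantiles) + 0.5) / n_quantiles
    qx = np.quantile(px, u, axis=0)  # shape (n_quantiles, n_slices)
    qy = np.quantile(py, u, axis=0)

    # In one dimension the squared 2-Wasserstein distance is the integrated
    # squared difference of quantile functions; average it over all slices
    # (assumed normalization) and take the square root.
    return np.sqrt(np.mean((qx - qy) ** 2))

# Usage: two bivariate Gaussian samples that differ by a location shift.
rng = np.random.default_rng(1)
a = rng.normal(size=(500, 2))
b = rng.normal(size=(500, 2)) + np.array([3.0, 0.0])
print(sliced_wasserstein(a, b))
```

Under this normalization, a pure shift by a vector s in d dimensions gives a distance close to the norm of s divided by the square root of d, which offers a quick sanity check of the approximation.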