While statistical modeling of distributional data has attracted growing attention, the case of multivariate distributions has been somewhat neglected despite its relevance in various applications. This is because the Wasserstein distance, commonly used in distributional data analysis, poses challenges for multivariate distributions. A promising alternative is the sliced Wasserstein distance, which is computationally simpler. We propose distributional regression models with multivariate distributions as responses paired with Euclidean vector predictors. The foundation of our methodology is a slicing transform from the multivariate distribution space to the sliced distribution space, for which we establish a theoretical framework, with the Radon transform as a prominent example. We introduce sample-based estimators for two regression approaches and study their asymptotic properties: the first uses the sliced Wasserstein distance directly in the multivariate distribution space, and the second is based on a new slice-wise distance and employs a univariate distribution regression for each slice. Both global and local Fréchet regression methods are deployed for these approaches and illustrated in simulations and through applications. The applications include joint distributions of excess winter death rates and winter temperature anomalies in European countries as a function of base winter temperature, as well as data from finance.
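To make the sliced Wasserstein distance mentioned above concrete, the following is a minimal sketch of a Monte Carlo approximation between two empirical samples. It is an illustration under stated assumptions, not the estimator studied in the paper: the function name `sliced_wasserstein`, the use of random directions on the unit sphere, and the quantile grid size are all hypothetical choices made here. Each slice projects both samples onto one direction, and the univariate 2-Wasserstein distance is then computed from the quantile functions of the projections.

```python
import numpy as np


def sliced_wasserstein(x, y, n_slices=200, n_quantiles=100, seed=0):
    """Monte Carlo approximation of the sliced 2-Wasserstein distance.

    x, y: arrays of shape (n, d) and (m, d) holding samples from two
    d-variate distributions. All names and defaults are illustrative.
    """
    rng = np.random.default_rng(seed)
    d = x.shape[1]

    # Directions uniform on the unit sphere: normalize Gaussian vectors.
    theta = rng.normal(size=(n_slices, d))
    theta /= np.linalg.norm(theta, axis=1, keepdims=True)

    # Project both samples onto every direction (one univariate sample per slice).
    px = x @ theta.T  # shape (n, n_slices)
    py = y @ theta.T  # shape (m, n_slices)

    # Univariate W2 per slice: L2 distance between quantile functions,
    # approximated on a midpoint grid of probability levels.
    probs = (np.arange(n_quantiles) + 0.5) / n_quantiles
    qx = np.quantile(px, probs, axis=0)  # shape (n_quantiles, n_slices)
    qy = np.quantile(py, probs, axis=0)
    w2_squared = np.mean((qx - qy) ** 2, axis=0)

    # Average the squared slice distances over directions, then take the root.
    return float(np.sqrt(np.mean(w2_squared)))


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    a = rng.normal(size=(500, 2))
    b = rng.normal(loc=[3.0, 0.0], size=(400, 2))
    print(sliced_wasserstein(a, b))
```

Because every slice reduces to sorting (or taking quantiles of) a univariate sample, the cost grows only linearly in the number of slices, which is the computational simplification the abstract alludes to. The slice-wise regression approach would instead keep the per-slice quantities separate rather than averaging them into a single distance.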