Quantile regression (QR) is a statistical tool for distribution-free estimation of conditional quantiles of a target variable given explanatory features. QR is limited by the assumption that the target distribution is univariate and defined on a Euclidean domain. Although the notion of quantiles was recently extended to multivariate distributions, QR for multivariate distributions on manifolds remains underexplored, even though many important applications inherently involve data distributed on, e.g., spheres (climate measurements), tori (dihedral angles in proteins), or Lie groups (attitude in navigation). By leveraging optimal transport theory and the notion of $c$-concave functions, we meaningfully define conditional vector quantile functions of high-dimensional variables on manifolds (M-CVQFs). Our approach allows for quantile estimation, regression, and computation of conditional confidence sets. We demonstrate the approach's efficacy and provide insights regarding the meaning of non-Euclidean quantiles through preliminary synthetic data experiments.