This paper proposes a new approach to estimating the distribution of a response variable conditional on observed factors. The approach is flexible, interpretable, tractable, and extensible. The conditional quantile function is modeled as a mixture (weighted sum) of basis quantile functions, with weights that depend on the factors. The calibration problem is formulated as a convex optimization problem. It can be viewed as conducting quantile regressions for all confidence levels simultaneously while avoiding quantile crossing by construction. The calibration problem is equivalent to minimizing the continuous ranked probability score (CRPS). Based on the canonical polyadic (CP) decomposition of tensors, we propose a dimensionality reduction method that reduces the rank of the parameter tensor, together with an alternating algorithm for estimation. Additionally, based on the Risk Quadrangle framework, we generalize the approach to conditional distributions defined by Conditional Value-at-Risk (CVaR), expectiles, and other functions of uncertainty measures. Although this paper focuses on splines as the weight functions, the approach can be extended to neural networks. Numerical experiments demonstrate the effectiveness of our approach.
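To make the mixture idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes two illustrative basis quantile functions (standard normal and unit exponential), weights that are linear in a single factor x in [0, 1] instead of the splines used in the paper, and hypothetical names (`basis_quantiles`, `conditional_quantiles`, `mean_pinball_loss`, `theta`, `loc`). Because each basis quantile function is nondecreasing in the confidence level and the weights are nonnegative, the modeled quantiles cannot cross. The pinball loss averaged over a grid of levels approximates the CRPS up to a constant factor.

```python
import numpy as np
from statistics import NormalDist

# Grid of confidence levels (an assumption; any increasing grid in (0, 1) works).
LEVELS = np.linspace(0.05, 0.95, 19)


def basis_quantiles(levels):
    """Two illustrative basis quantile functions, each nondecreasing in the level."""
    normal = np.array([NormalDist().inv_cdf(float(p)) for p in levels])
    exponential = -np.log1p(-levels)
    return np.stack([normal, exponential])  # shape (2, n_levels)


def conditional_quantiles(x, theta, loc, levels=LEVELS):
    """Q(p | x) = loc(x) + sum_i w_i(x) * Q_i(p), with w_i(x) = theta[i, 0] + theta[i, 1] * x.

    Assumes 0 <= x <= 1 and theta >= 0 elementwise, so every weight is nonnegative.
    The model is linear in (theta, loc), which keeps the pinball loss convex in them.
    """
    x = np.atleast_1d(np.asarray(x, dtype=float))
    w = theta[:, 0][None, :] + np.outer(x, theta[:, 1])      # (n_obs, 2)
    location = loc[0] + loc[1] * x                           # (n_obs,)
    return location[:, None] + w @ basis_quantiles(levels)   # (n_obs, n_levels)


def mean_pinball_loss(y, q, levels=LEVELS):
    """Pinball (quantile) loss averaged over observations and confidence levels."""
    u = np.asarray(y, dtype=float)[:, None] - q
    indicator = (u < 0).astype(float)
    return float(np.mean(u * (levels[None, :] - indicator)))


# Hypothetical parameter values, chosen only for illustration.
theta = np.array([[0.5, 1.0], [0.2, 0.0]])
loc = np.array([1.0, 2.0])
x = np.array([0.0, 0.5, 1.0])
y = np.array([1.2, 2.5, 2.8])

q = conditional_quantiles(x, theta, loc)
no_crossing = bool(np.all(np.diff(q, axis=1) >= 0))
loss = mean_pinball_loss(y, q)
```

In this sketch, fitting would mean minimizing `mean_pinball_loss` over `theta >= 0` and `loc`, which is a convex problem because the quantiles are linear in those parameters.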