Constructing minimum-volume prediction regions that satisfy conditional coverage is a fundamental challenge in multivariate regression. Standard approaches explicitly estimate the full conditional density and then threshold it. This two-step plug-in procedure is notoriously difficult, sensitive to estimation errors, and computationally expensive. One would instead like to optimize the region directly. Formulating a direct solution is challenging, however, because it requires minimizing a volume objective that is coupled with the conditional quantiles of the model's own estimation error. In this work, we address this challenge. We introduce super-level-set regression (SLS), a novel mathematical framework that resolves this implicit coupling, allowing us to directly parameterize and optimize the geometric boundaries of the target conditional level sets. By bypassing full distribution estimation and leveraging flexible volume-preserving frontier functions, our approach natively captures complex, multimodal, and disjoint conditional structures end-to-end. Ultimately, SLS offers a new perspective on multivariate conditional quantile regression, replacing the restrictive assumptions of density-first methods with a direct geometric optimization strategy.