Standard regression methods typically optimize a single pointwise objective, such as mean squared error, which conflates the learning of ordering with the learning of scale. This coupling renders models vulnerable to outliers and heavy-tailed noise. We propose CAIRO (Calibrate After Initial Rank Ordering), a framework that decouples regression into two distinct stages. In the first stage, we learn a scoring function by minimizing a scale-invariant ranking loss; in the second, we recover the target scale via isotonic regression. We theoretically characterize a class of "Optimal-in-Rank-Order" objectives -- including variants of RankNet and Gini covariance -- and prove that they recover the ordering of the true conditional mean under mild assumptions. We further show that subsequent monotone calibration guarantees recovery of the true regression function. Empirically, CAIRO combines the representation learning of neural networks with the robustness of rank-based statistics. It matches the performance of state-of-the-art tree ensembles on tabular benchmarks and significantly outperforms standard regression objectives in regimes with heavy-tailed or heteroskedastic noise.
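The two-stage pipeline described above can be sketched in a few dozen lines. The code below is a minimal illustrative toy, not the paper's implementation: it fits a one-parameter linear scorer `f(x) = w*x` by gradient descent on a pairwise logistic (RankNet-style) loss, which depends only on the ordering of the targets, and then calibrates the scores to the target scale with isotonic regression via the pool-adjacent-violators algorithm. All function names and hyperparameters here are hypothetical choices for the sketch.

```python
import math

def fit_ranker(xs, ys, lr=0.1, epochs=200):
    """Stage 1 (toy): linear scorer trained on a pairwise logistic
    ranking loss, sum over pairs with ys[i] > ys[j] of
    log(1 + exp(-(f(x_i) - f(x_j)))). Scale-invariant in y."""
    w = 0.0
    n = len(xs)
    for _ in range(epochs):
        grad = 0.0
        for i in range(n):
            for j in range(n):
                if ys[i] > ys[j]:
                    d = w * (xs[i] - xs[j])
                    # d/dw of log(1 + exp(-d))
                    grad += -(xs[i] - xs[j]) / (1.0 + math.exp(d))
        w -= lr * grad / (n * n)
    return w

def isotonic(scores, ys):
    """Stage 2 (toy): pool-adjacent-violators regression of y on the
    learned scores, returning monotone calibrated predictions."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    out = []  # stack of [block_mean, block_count]
    for i in order:
        out.append([ys[i], 1])
        # merge adjacent blocks until means are nondecreasing
        while len(out) > 1 and out[-2][0] > out[-1][0]:
            m2, c2 = out.pop()
            m1, c1 = out.pop()
            out.append([(m1 * c1 + m2 * c2) / (c1 + c2), c1 + c2])
    fitted = []
    for m, c in out:
        fitted.extend([m] * c)
    preds = [0.0] * len(ys)
    for pos, i in enumerate(order):
        preds[i] = fitted[pos]
    return preds

# Toy data: monotone signal with one heavy outlier in y.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [0.1, 1.2, 1.9, 3.2, 100.0, 5.1]

w = fit_ranker(xs, ys)
scores = [w * x for x in xs]
preds = isotonic(scores, ys)
```

Because the stage-1 loss only compares pairs, the outlier at `x = 4` cannot distort the learned scale; the isotonic step then absorbs it by pooling it with its neighbor rather than dragging the whole fit.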