Conditional Value-at-Risk (CVaR) is a widely used risk-sensitive objective for learning under rare but high-impact losses, yet its statistical behavior under heavy-tailed data remains poorly understood. Unlike expectation-based risk, CVaR depends on an endogenous, data-dependent quantile, which couples tail averaging with threshold estimation and fundamentally alters both generalization and robustness properties. In this work, we develop a learning-theoretic analysis of CVaR-based empirical risk minimization under heavy-tailed and contaminated data. We establish sharp, high-probability generalization and excess risk bounds under minimal moment assumptions, covering fixed hypotheses, finite and infinite classes, and extending to $\beta$-mixing dependent data; we further show that these rates are minimax optimal. To capture the intrinsic quantile sensitivity of CVaR, we derive a uniform Bahadur-Kiefer-type expansion that isolates a threshold-driven error term absent in mean-risk ERM and essential in heavy-tailed regimes. We complement these results with robustness guarantees by proposing a truncated median-of-means CVaR estimator that achieves optimal rates under adversarial contamination. Finally, we show that CVaR decisions themselves can be intrinsically unstable under heavy tails, establishing a fundamental limitation on decision robustness even when the population optimum is well separated. Together, our results provide a principled characterization of when CVaR learning generalizes and is robust, and when instability is unavoidable due to tail scarcity.
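To make the objects in the abstract concrete, the following is a minimal sketch of the plain empirical CVaR estimator (tail average above the empirical quantile) and a simplified truncated median-of-means variant. The function names, the truncation rule `np.clip`, and the block-splitting scheme are illustrative assumptions for exposition; the paper's exact construction and tuning of the truncation level and block count may differ.

```python
import numpy as np

def empirical_cvar(losses, alpha=0.95):
    """Plain empirical CVaR at level alpha: the mean of losses at or
    above the empirical alpha-quantile (the empirical VaR)."""
    var = np.quantile(losses, alpha)          # empirical threshold (VaR)
    tail = losses[losses >= var]              # tail sample above the threshold
    return float(tail.mean())

def truncated_mom_cvar(losses, alpha=0.95, n_blocks=8, tau=None, seed=0):
    """Illustrative truncated median-of-means CVaR estimator:
    optionally truncate losses at a level tau, randomly split the
    sample into blocks, compute CVaR on each block, and return the
    median of the per-block estimates. This sketches the general
    median-of-means principle; it is not the paper's exact estimator."""
    x = np.clip(losses, None, tau) if tau is not None else losses
    perm = np.random.default_rng(seed).permutation(len(x))
    blocks = np.array_split(x[perm], n_blocks)
    return float(np.median([empirical_cvar(b, alpha) for b in blocks]))
```

On heavy-tailed data (e.g. Pareto losses with a low tail index), the median-of-means step dampens the influence of a few extreme observations that would otherwise dominate the plain tail average, which is the intuition behind its robustness to contamination.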