We introduce Transductive Local Complexity (TLC), which extends the classical Local Rademacher Complexity (LRC) to the transductive setting with substantial new components. Although LRC has been used to obtain sharp generalization bounds and minimax rates for inductive tasks such as classification and nonparametric regression, it has remained an open problem whether a localized Rademacher complexity framework can be effectively adapted to transductive learning so as to achieve sharp, or nearly sharp, bounds consistent with their inductive counterparts. We provide an affirmative answer via TLC. TLC is constructed in two steps. First, Theorem 4.1 establishes a new concentration inequality for the supremum of the empirical process that captures the gap between the test loss and the training loss, termed the test-train process, under uniform sampling without replacement; the proof leverages a novel combinatorial property of the test-train process and a new strategy that applies the exponential Efron-Stein inequality twice. Second, a peeling argument applied to a new decomposition of the expectation of the test-train process, combined with a new surrogate variance operator, yields excess risk bounds in the transductive setting that match classical LRC-based inductive bounds up to a logarithmic gap. We further advance transductive learning through two applications: (1) for realizable transductive learning over binary-valued classes with finite VC dimension $\dVC$ and $u \ge m \ge \dVC$, where $u$ and $m$ are the numbers of test and training features respectively, Theorem 6.1 gives a nearly optimal bound of $\Theta(\dVC \log(me/\dVC)/m)$, which matches the minimax rate $\Theta(\dVC/m)$ up to a $\log m$ factor and resolves a decade-old open problem; and (2) Theorem 6.2 gives a sharper excess risk bound for transductive kernel learning than the current state of the art.
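To fix ideas, the following display is a minimal sketch of the test-train process referred to above; the notation ($\mathcal{F}$ for the hypothesis class, $z_1, \dots, z_{m+u}$ for the full set of features, $\ell$ for the loss, $S$ for the training indices) is illustrative and need not coincide with the paper's. The training set is a uniformly random size-$m$ subset of $[m+u]$ drawn without replacement, and the test-train process is the supremum
\[
\sup_{f \in \mathcal{F}} \left( \frac{1}{u} \sum_{i \in [m+u] \setminus S} \ell(f, z_i) \;-\; \frac{1}{m} \sum_{i \in S} \ell(f, z_i) \right), \qquad S \sim \mathrm{Unif}\bigl( \{ S \subseteq [m+u] : |S| = m \} \bigr),
\]
whose upper tail Theorem 4.1 controls; the difficulty relative to the inductive case is that the summands are dependent under sampling without replacement.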
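As a hedged illustration of the inductive benchmark against which the TLC bounds are compared, recall the generic LRC template (again in our notation, not the paper's exact statement): if $\psi$ is a sub-root function dominating the localized Rademacher complexity, with fixed point $r^*$ solving $\psi(r^*) = r^*$, then for an empirical risk minimizer $\hat{f}$ over $\mathcal{F}$ one has, with probability at least $1 - \delta$,
\[
\mathbb{E}\bigl[\ell(\hat{f})\bigr] - \inf_{f \in \mathcal{F}} \mathbb{E}\bigl[\ell(f)\bigr] \;\lesssim\; r^* + \frac{\log(1/\delta)}{m}.
\]
The transductive excess risk bounds obtained via TLC match this inductive template up to the logarithmic gap noted above, with the surrogate variance operator playing the role of the usual variance condition.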