Sharp Generalization of Transductive Learning: A Transductive Local Rademacher Complexity Approach

We introduce a new tool, Transductive Local Rademacher Complexity (TLRC), to analyze the generalization performance of transductive learning methods and to motivate new transductive learning algorithms. Our work extends the popular Local Rademacher Complexity (LRC) to the transductive setting, with considerable changes compared to the analysis of typical LRC methods in the inductive setting. We present a localized, Rademacher-complexity-based tool which can be applied to various transductive learning problems and which yields sharp bounds under proper conditions. Similar to the development of LRC, we build TLRC by starting from a sharp concentration inequality for independent variables with variance information. The prediction function class of a transductive learning model is then divided into pieces, with a sub-root function upper-bounding the Rademacher complexity of each piece and with the variance of all functions in each piece bounded. A carefully designed variance operator ensures that the bound for the test loss on unlabeled test data in the transductive setting closely resembles the classical LRC bound in the inductive setting. We use the new TLRC tool to analyze the Transductive Kernel Learning (TKL) model, where the labels of test data are generated by a kernel function. The result for TKL lays the foundation for generalization bounds for two types of transductive learning tasks, Graph Transductive Learning (GTL) and Transductive Nonparametric Kernel Regression (TNKR). When the target function is low-dimensional or approximately low-dimensional, we design low-rank methods for both GTL and TNKR, which enjoy particularly sharp generalization bounds by TLRC that, to the best of our knowledge, cannot be achieved by existing learning theory methods.
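
As background for the terminology above, the following recalls the standard definition of a sub-root function and the schematic shape of the classical inductive LRC bound (in the style of Bartlett, Bousquet, and Mendelson). It is a sketch of the inductive setting only, not the TLRC result of this work. The symbols $\psi$, $r^*$, $P$, $P_n$, $K$, $B$, $x$ and the unspecified constants $c_1, c_2$ are notation assumed here for illustration.

```latex
% Sub-root function: nonnegative, nondecreasing, and r -> psi(r)/sqrt(r) nonincreasing.
\[
  \psi : [0,\infty) \to [0,\infty), \qquad
  \psi \text{ nondecreasing}, \qquad
  r \mapsto \frac{\psi(r)}{\sqrt{r}} \text{ nonincreasing for } r > 0 .
\]
% A sub-root function that is not identically zero has a unique positive fixed point r^*.
\[
  \psi(r^*) = r^* , \qquad r^* > 0 .
\]
% Schematic inductive LRC bound (assumed form): if Var[f] <= B * Pf for all f in the class,
% and psi upper-bounds the Rademacher complexity of the localized piece {f : Var[f] <= r}
% (up to the factor B), then with probability at least 1 - e^{-x}, for all f and any K > 1,
\[
  P f \;\le\; \frac{K}{K-1}\, P_n f \;+\; c_1\, \frac{K}{B}\, r^* \;+\; c_2\, \frac{x}{n} .
\]
```

The fixed point $r^*$ is what makes such a bound sharp: it can be of order $1/n$ in favorable cases, rather than the $1/\sqrt{n}$ rate given by a global Rademacher complexity.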
