Most works studying representation learning focus only on classification and neglect regression. Yet, the learning objectives and, therefore, the representation topologies of the two tasks are fundamentally different: classification targets class separation, leading to disconnected representations, whereas regression requires ordinality with respect to the target, leading to continuous representations. We thus study how the effectiveness of a regression representation is influenced by its topology, with evaluation based on the Information Bottleneck (IB) principle, an important framework for learning effective representations. We establish two connections between the IB principle and the topology of regression representations. The first connection reveals that a lower intrinsic dimension of the feature space implies a reduced complexity of the representation Z. This complexity can be quantified as the conditional entropy of Z given the target Y, and serves as an upper bound on the generalization error. The second connection suggests that a feature space topologically similar to the target space better aligns with the IB principle. Based on these two connections, we introduce PH-Reg, a regularizer specific to regression that matches the intrinsic dimension and topology of the feature space with those of the target space. Experiments on synthetic and real-world regression tasks demonstrate the benefits of PH-Reg. Code: https://github.com/needylove/PH-Reg.
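To make the idea of matching feature-space topology to target-space topology concrete, the sketch below compares the 0-dimensional persistence structure of a feature batch and a target batch via the edge lengths of their minimum spanning trees (0-dimensional persistent homology of a point cloud is determined by its MST). This is only an illustrative sketch, not the authors' PH-Reg implementation; the function names `mst_edge_lengths` and `topology_mismatch` are assumptions introduced here for clarity.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.sparse.csgraph import minimum_spanning_tree

def mst_edge_lengths(points):
    """Sorted edge lengths of the Euclidean MST of a point cloud.

    For n distinct points this yields n-1 lengths; they coincide with the
    deaths of the 0-dimensional persistent homology bars of the cloud.
    """
    dists = squareform(pdist(points))          # (n, n) pairwise distances
    mst = minimum_spanning_tree(dists).toarray()
    return np.sort(mst[mst > 0])               # the n-1 positive MST edges

def topology_mismatch(features, targets):
    """Illustrative penalty: discrepancy between the 0-dim persistence
    structure of a feature batch Z and its target batch Y.

    A differentiable variant of such a term could act as a topology-matching
    regularizer; this numpy version only computes the scalar mismatch.
    """
    ez = mst_edge_lengths(features)
    ey = mst_edge_lengths(targets)
    return float(np.mean((ez - ey) ** 2))
```

When the feature batch already mirrors the target batch, the mismatch is zero; features whose connectivity structure diverges from the targets incur a positive penalty.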