Label Distribution Learning (LDL) assigns soft labels, also known as degrees, to a sample. In practice, obtaining complete degrees is laborious, which gives rise to Incomplete LDL (InLDL). However, InLDL often suffers from performance degradation. To remedy this, existing methods rely on one or more explicit regularizations, which leads to burdensome parameter tuning and extra computation. We argue that the label distribution itself may provide a useful prior and that, when this prior is used appropriately, the InLDL problem can be solved without any explicit regularization. In this paper, we offer a rational alternative that exploits such a prior. Our intuition is that large degrees tend to receive more attention, small ones are easily overlooked, and missing degrees are completely neglected in InLDL. To learn an accurate label distribution, it is crucial not to ignore the small observed degrees but to give them suitably large weights, while gradually increasing the weights of the missing degrees. To this end, we first define a weighted empirical risk and derive upper bounds relating the expected risk to the weighted empirical risk, which reveal in principle that weighting plays an implicit regularization role. Then, using the prior of degrees, we design a weighting scheme and verify its effectiveness. To sum up, our model has four advantages: it is 1) free of model selection, as no explicit regularization is imposed; 2) equipped with a closed-form solution (for the sub-problem) and easy to implement (a few lines of code); 3) linear in computational complexity with respect to the number of samples, and thus scalable to large datasets; and 4) competitive with state-of-the-art methods even without any explicit regularization.
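To make the idea of a weighted empirical risk with a closed-form sub-problem concrete, the following is a minimal sketch in NumPy. It is not the weighting scheme of this paper: the linear model, the function names `fit_weighted_ldl` and `predict_distribution`, the observed-entry weight `2 - d`, the linear ramp for missing entries, and the use of the previous prediction as the target for missing entries are all assumptions chosen only for illustration. The sketch shows how per-entry weights turn each label column into a weighted least-squares problem that can be solved directly, with no explicit regularization term.

```python
import numpy as np


def fit_weighted_ldl(X, D, observed, n_rounds=5):
    """Illustrative weighted least-squares fit for incomplete label distributions.

    X        : (n, d) feature matrix
    D        : (n, c) degree matrix; entries where `observed` is False are ignored
    observed : (n, c) boolean mask of observed degrees
    Returns a (d, c) weight matrix theta of a linear model X @ theta.
    """
    n, c = D.shape
    D_obs = np.where(observed, D, 0.0)   # neutralise NaNs in missing entries
    target = D_obs.copy()                # regression targets, refined each round
    theta = np.zeros((X.shape[1], c))

    for t in range(n_rounds):
        # Hypothetical weighting: observed small degrees get larger weights
        # (2 - d lies in [1, 2]); missing entries start at weight 0 and ramp up.
        ramp = t / n_rounds
        W = np.where(observed, 2.0 - D_obs, ramp)

        # Closed-form sub-problem: one weighted least-squares solve per label.
        for j in range(c):
            sw = np.sqrt(W[:, j])
            theta[:, j] = np.linalg.lstsq(
                X * sw[:, None], target[:, j] * sw, rcond=None
            )[0]

        # Assumption: missing entries are filled with the current prediction.
        pred = np.clip(X @ theta, 0.0, None)
        target = np.where(observed, D_obs, pred)

    return theta


def predict_distribution(X, theta):
    """Clip predictions to be positive and normalise each row to sum to 1."""
    P = np.clip(X @ theta, 1e-12, None)
    return P / P.sum(axis=1, keepdims=True)
```

Each round performs one least-squares solve per label, so for a fixed feature dimension the cost of this sketch grows linearly with the number of samples. The number of rounds and the ramp schedule are arbitrary choices here and would need to be replaced by whatever scheme a given method actually prescribes.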