Label Distribution Learning (LDL) is a novel machine learning paradigm that addresses label ambiguity and has found widespread application. Because complete label distributions are difficult to obtain in real-world scenarios, Incomplete Label Distribution Learning (InLDL) has emerged. However, existing InLDL methods overlook a crucial property of LDL data: the inherent imbalance in label distributions. To address this limitation, we propose \textbf{Incomplete and Imbalance Label Distribution Learning (I\(^2\)LDL)}, a framework that simultaneously handles incomplete labels and imbalanced label distributions. Our method decomposes the label distribution matrix into a low-rank component for frequent labels and a sparse component for rare labels, capturing the structure of both head and tail labels. We optimize the model using the Alternating Direction Method of Multipliers (ADMM) and derive generalization error bounds via Rademacher complexity, which provides theoretical guarantees. Extensive experiments on 15 real-world datasets demonstrate the effectiveness and robustness of the proposed framework compared with existing InLDL methods.
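To make the low-rank plus sparse idea concrete, the following is a minimal sketch of a generic decomposition \(D \approx L + S\) solved with ADMM. It is the standard robust-PCA formulation (nuclear norm on \(L\), \(\ell_1\) norm on \(S\)), not the I\(^2\)LDL objective itself: it ignores the feature-to-label mapping and the handling of missing entries. The function names, the default weight `lam`, and the penalty `mu` are illustrative assumptions.

```python
import numpy as np


def svd_shrink(M, tau):
    """Singular value thresholding: proximal operator of tau * nuclear norm."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt


def soft_threshold(M, tau):
    """Elementwise shrinkage: proximal operator of tau * L1 norm."""
    return np.sign(M) * np.maximum(np.abs(M) - tau, 0.0)


def low_rank_plus_sparse(D, lam=None, mu=None, n_iter=500, tol=1e-7):
    """Split D into L (low rank) + S (sparse) with ADMM.

    Generic robust-PCA sketch; hypothetical helper, not the paper's solver.
    """
    D = np.asarray(D, dtype=float)
    n, c = D.shape
    # Common default choices for robust PCA (assumptions, not from the paper).
    lam = 1.0 / np.sqrt(max(n, c)) if lam is None else lam
    mu = 0.25 * n * c / (np.abs(D).sum() + 1e-12) if mu is None else mu

    L = np.zeros_like(D)
    S = np.zeros_like(D)
    Y = np.zeros_like(D)  # Lagrange multiplier for the constraint D = L + S
    for _ in range(n_iter):
        L = svd_shrink(D - S + Y / mu, 1.0 / mu)       # low-rank update
        S = soft_threshold(D - L + Y / mu, lam / mu)   # sparse update
        R = D - L - S                                  # constraint residual
        Y = Y + mu * R                                 # dual ascent step
        if np.linalg.norm(R) <= tol * np.linalg.norm(D):
            break
    return L, S
```

In this sketch, `L` would play the role of the shared structure among frequent (head) labels and `S` the isolated mass on rare (tail) labels. The actual method additionally has to respect the observed-entry pattern of the incomplete label distribution matrix.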