In this paper, we introduce the imprecise label learning (ILL) framework, a unified approach to handling the various imprecise label configurations that commonly arise in machine learning tasks. ILL uses an expectation-maximization (EM) algorithm to perform maximum likelihood estimation (MLE) over the imprecise label information, treating the precise labels as latent variables. Whereas previous versatile methods attempt to infer the correct labels from the imprecise label information, ILL considers all possible labelings consistent with that information, which yields a single solution applicable to any type of imprecise label. Comprehensive experiments demonstrate that ILL adapts seamlessly to a range of settings, including partial label learning, semi-supervised learning, noisy label learning, and mixtures of these settings. Notably, our simple method surpasses existing techniques for handling imprecise labels, making it the first unified framework with robust and effective performance across various imprecise label configurations. We believe that our approach has the potential to significantly enhance the performance of machine learning models on tasks where obtaining precise labels is expensive and complicated. We hope that our work, together with the release of an open-source codebase, will inspire further research on this topic.
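To make the EM view concrete, the following is a minimal sketch of one way the latent-label idea can be instantiated, not the ILL implementation itself. It assumes that the imprecise label of an instance can be summarized as a set of candidate classes, and that the candidate set carries no information beyond containing the true label. Under those assumptions the E-step posterior over the latent precise label is the model's predicted distribution renormalized over the candidates, and the M-step objective is a cross-entropy against that soft target. The function names `e_step` and `m_step_loss` and the example probabilities are hypothetical. Noisy labels would additionally require a noise model and are not covered here.

```python
import math

def e_step(probs, candidates):
    """Posterior over the latent precise label.

    probs: model prediction p(y | x) as a list of class probabilities.
    candidates: set of class indices allowed by the imprecise label
                (an illustrative assumption about how the label is encoded).
    """
    masked = [p if k in candidates else 0.0 for k, p in enumerate(probs)]
    total = sum(masked)
    if total == 0.0:
        raise ValueError("candidate set has zero probability under the model")
    return [m / total for m in masked]

def m_step_loss(probs, posterior):
    """Expected negative log-likelihood under the E-step posterior."""
    return -sum(q * math.log(p) for q, p in zip(posterior, probs) if q > 0.0)

probs = [0.5, 0.3, 0.2]  # hypothetical model output for one instance

# Partial label: the true class is known to be 1 or 2.
q_partial = e_step(probs, {1, 2})

# Unlabeled instance (semi-supervised case): every class is a candidate,
# so the posterior is just the model's own prediction.
q_unlabeled = e_step(probs, {0, 1, 2})

# Precise label: a single candidate recovers the usual one-hot target.
q_precise = e_step(probs, {0})

loss_precise = m_step_loss(probs, q_precise)  # ordinary cross-entropy
```

In this sketch, fully supervised, partial-label, and unlabeled instances differ only in the candidate set passed to `e_step`, which illustrates how a single objective can cover several imprecise label configurations. In practice the M-step would update the model parameters by minimizing `m_step_loss` summed over the dataset.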