Positive-unlabeled (PU) learning aims to train a classifier using data that contain only labeled-positive instances and unlabeled instances. However, existing PU learning methods generally struggle to achieve satisfactory performance on trifurcate data, where the positive instances are distributed on both sides of the negative instances. To address this issue, we first propose a PU classifier with asymmetric loss (PUAL), which introduces an asymmetric loss structure on positive instances into the objective function of the global and local learning classifier. We then develop a kernel-based algorithm that enables PUAL to obtain a non-linear decision boundary. Experiments on both simulated and real-world datasets show that PUAL can achieve satisfactory classification on trifurcate data.