Positive and unlabelled learning is an important problem that arises naturally in many applications. A significant limitation of almost all existing methods is that they assume the propensity score function is constant (the SCAR assumption), which is unrealistic in many practical situations. Avoiding this assumption, we consider a parametric approach to the problem of jointly estimating the posterior probability and propensity score functions. We show that, under mild assumptions, when both functions have the same parametric form (e.g. logistic with different parameters), the corresponding parameters are identifiable. Motivated by this, we propose two approaches to their estimation: a joint maximum likelihood method and a second approach based on alternating maximization of two Fisher consistent expressions. Our experimental results show that the proposed methods are comparable to or better than the existing methods based on the Expectation-Maximisation scheme.
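To make the joint maximum likelihood idea concrete, the following is a minimal sketch rather than the authors' implementation. It assumes that both the posterior probability and the propensity score are logistic in the same features, so that the probability of an example being labelled is their product. The names `sigmoid`, `joint_nll`, `beta`, `gamma`, `X` and `s` are hypothetical and chosen for illustration only.

```python
import numpy as np


def sigmoid(z):
    """Logistic function."""
    return 1.0 / (1.0 + np.exp(-z))


def joint_nll(beta, gamma, X, s, eps=1e-12):
    """Negative joint log-likelihood of the labelling indicator s.

    Assumed model (illustrative):
      posterior   y(x) = sigmoid(x @ beta)   ~ P(Y = 1 | x)
      propensity  e(x) = sigmoid(x @ gamma)  ~ P(S = 1 | Y = 1, x)
      labelled    P(S = 1 | x) = y(x) * e(x)
    """
    y = sigmoid(X @ beta)
    e = sigmoid(X @ gamma)
    # Clip to keep the logarithms finite.
    p = np.clip(y * e, eps, 1.0 - eps)
    return -np.sum(s * np.log(p) + (1 - s) * np.log(1.0 - p))


# Toy data: two examples, the first labelled (s = 1), the second unlabelled (s = 0).
X = np.array([[1.0, 0.0], [1.0, 2.0]])
s = np.array([1, 0])
value = joint_nll(np.zeros(2), np.zeros(2), X, s)
```

In this sketch the two parameter vectors would be estimated by minimising `joint_nll` over `beta` and `gamma` jointly with a generic numerical optimiser. The alternating scheme mentioned in the abstract is a different procedure and is not shown here.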