We provide a theoretical and computational investigation of the Gamma-Maximin method with soft revision, which was recently proposed as a robust criterion for pseudo-label selection (PLS) in semi-supervised learning. In contrast to traditional methods for PLS, we use credal sets of priors ("generalized Bayes") to represent the epistemic modeling uncertainty. These credal sets are then updated by the Gamma-Maximin method with soft revision. We eventually select the pseudo-labeled data that are most likely in light of the least favorable distribution from the updated credal set. We formalize the task of finding pseudo-labeled data that are optimal with respect to the Gamma-Maximin method with soft revision as an optimization problem. A concrete implementation for the class of logistic models then allows us to compare the predictive power of the method with competing approaches. We observe that the Gamma-Maximin method with soft revision can achieve very promising results, especially when the proportion of labeled data is low.
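The selection rule sketched in the abstract can be illustrated as follows. All numbers, the variable names, and the precise form of the soft-revision threshold `beta` are illustrative assumptions for this sketch, not the paper's actual implementation: we retain only the priors whose evidence is within a `beta`-fraction of the best evidence (soft revision), then score each candidate by its value under the least favorable retained posterior (Gamma-Maximin).

```python
import numpy as np

# Hypothetical numbers: 4 priors in the credal set, 3 unlabeled candidates.
# evidence[i]  = marginal likelihood of the labeled data under prior i
# scores[i, j] = likelihood of pseudo-labeled candidate j under the
#                posterior obtained from prior i (generalized Bayes update)
evidence = np.array([1.0, 0.9, 0.5, 0.2])
scores = np.array([
    [0.9, 0.2, 0.60],
    [0.4, 0.8, 0.50],
    [0.7, 0.3, 0.55],
    [0.1, 0.9, 0.40],
])

beta = 0.6  # soft-revision threshold (assumed form of the criterion)

# Soft revision: keep only priors whose evidence reaches a beta-fraction
# of the maximum evidence, discarding the rest of the credal set.
kept = evidence >= beta * evidence.max()

# Gamma-Maximin: evaluate each candidate under the least favorable
# retained posterior, then select the candidate maximizing that value.
worst_case = scores[kept].min(axis=0)
best = int(worst_case.argmax())  # index of the pseudo-label to add
```

With these toy numbers, priors 3 and 4 are dropped by soft revision, and candidate 3 (index 2) is selected because its worst-case score (0.5) beats the other candidates' worst cases (0.4 and 0.2).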