We present an efficient, parameter-free approach to statistical learning from corrupted training sets. We use latent Bernoulli variables to mark each sample as corrupted or non-corrupted, and formulate robust learning as maximization of the likelihood with these latent variables marginalized out. The resulting optimization problem is solved via variational inference, using an efficient Expectation-Maximization based method. The proposed approach improves on the state of the art by automatically inferring the corruption level and identifying outliers, while adding minimal computational overhead. We demonstrate our robust learning method on a wide variety of machine learning tasks, including online learning and deep learning, where it adapts to different levels of noise and attains high prediction accuracy.
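To make the idea of latent Bernoulli indicators concrete, the following is a minimal sketch of an EM loop for robust linear regression. It is not the method of the paper, whose likelihood and variational scheme are not given in the abstract. Everything specific here is an assumption made for illustration: the linear model, the Gaussian likelihood for clean samples, the uniform density over the observed range of `y` for corrupted samples, and the hypothetical function name `robust_linear_em`. The sketch does show the two properties the abstract emphasizes: the clean fraction `pi` is inferred rather than set by hand, and the per-sample posteriors `r` flag likely outliers.

```python
import numpy as np


def robust_linear_em(X, y, n_iter=100):
    """Illustrative EM for linear regression with latent Bernoulli 'clean' indicators.

    Assumptions (not taken from the source): clean samples follow
    y ~ N(X @ w, sigma2); corrupted samples follow a uniform density
    over the observed range of y.
    """
    n, d = X.shape
    # Initialize with ordinary least squares on all samples.
    w = np.linalg.lstsq(X, y, rcond=None)[0]
    sigma2 = np.var(y - X @ w) + 1e-12
    pi = 0.5  # prior probability that a sample is clean; updated in the M-step
    p_out = 1.0 / (y.max() - y.min() + 1e-12)  # assumed outlier density

    for _ in range(n_iter):
        # E-step: posterior probability that each sample is clean.
        resid = y - X @ w
        p_in = np.exp(-0.5 * resid**2 / sigma2) / np.sqrt(2.0 * np.pi * sigma2)
        r = pi * p_in / (pi * p_in + (1.0 - pi) * p_out)

        # M-step: weighted least squares, weighted noise variance, clean fraction.
        Xr = X * r[:, None]
        w = np.linalg.solve(X.T @ Xr + 1e-9 * np.eye(d), Xr.T @ y)
        resid = y - X @ w
        sigma2 = np.sum(r * resid**2) / (np.sum(r) + 1e-12) + 1e-12
        pi = float(np.clip(r.mean(), 1e-6, 1.0 - 1e-6))

    return w, pi, r
```

Calling `robust_linear_em(X, y)` on a design matrix `X` of shape `(n, d)` and targets `y` of shape `(n,)` returns the fitted weights, the estimated clean fraction, and the per-sample clean probabilities. Samples with a small value in `r` are the ones the model treats as corrupted.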