We introduce RACH-Space, an algorithm for labelling unlabelled data in weakly supervised learning, given incomplete and noisy information about the labels. RACH-Space is simple to implement, does not require strong assumptions on the data or on the sources of weak supervision, and is well suited to practical applications where fully labelled data is not available. Our method is built upon a geometric interpretation of the space spanned by the set of weak signals. We also analyze the theoretical properties underlying the relationship between the convex hulls in this space and the accuracy of our output labels, bridging geometry with machine learning. Empirical results demonstrate that RACH-Space works well in practice and compares favorably to the best existing label models for weakly supervised learning.