This work introduces Dirichlet Active Learning (DiAL), a Bayesian-inspired approach to the design of active learning algorithms. Our framework models feature-conditional class probabilities as a Dirichlet random field and lends observational strength between similar features in order to calibrate the random field. This random field can then be used in learning tasks: in particular, we can use current estimates of its mean and variance to conduct classification and active learning in settings where labeled data are scarce. We demonstrate the applicability of this model to low-label-rate graph learning by constructing "propagation operators" based upon the graph Laplacian, and offer computational studies demonstrating the method's competitiveness with the state of the art. Finally, we provide rigorous guarantees regarding the ability of this approach to ensure both exploration and exploitation, expressed respectively in terms of cluster exploration and increased attention to decision boundaries.
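The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the specific propagation operator K = (L + tau*I)^{-1}, the prior value, the total-variance acquisition rule, and all function names are assumptions chosen for illustration. The abstract only states that propagation operators are built from the graph Laplacian and that mean and variance estimates drive classification and active learning.

```python
import numpy as np

def propagation_operator(W, tau=0.1):
    """Assumed operator K = (L + tau*I)^{-1}, where L = D - W is the graph Laplacian.
    For nonnegative symmetric W and tau > 0, K has nonnegative entries."""
    L = np.diag(W.sum(axis=1)) - W
    return np.linalg.inv(L + tau * np.eye(W.shape[0]))

def dirichlet_field(K, labeled_idx, labels, n_classes, prior=0.1):
    """Dirichlet parameters per node: a flat prior plus label evidence
    propagated from the labeled nodes through K (hypothetical construction)."""
    onehot = np.eye(n_classes)[labels]            # shape (m, C)
    return prior + K[:, labeled_idx] @ onehot     # shape (n, C)

def mean_and_variance(alpha):
    """Standard mean and marginal variance of a Dirichlet distribution."""
    alpha0 = alpha.sum(axis=1, keepdims=True)
    mean = alpha / alpha0
    var = mean * (1.0 - mean) / (alpha0 + 1.0)
    return mean, var

def next_query(alpha, labeled_idx):
    """Assumed acquisition rule: pick the unlabeled node with the largest total variance."""
    _, var = mean_and_variance(alpha)
    score = var.sum(axis=1)
    score[labeled_idx] = -np.inf                  # never re-query labeled nodes
    return int(np.argmax(score))

# Toy graph: two triangles (nodes 0-2 and 3-5) joined by one weak edge.
W = np.zeros((6, 6))
for block in ([0, 1, 2], [3, 4, 5]):
    for i in block:
        for j in block:
            if i != j:
                W[i, j] = 1.0
W[2, 3] = W[3, 2] = 0.01

K = propagation_operator(W)
alpha = dirichlet_field(K, labeled_idx=[0], labels=[0], n_classes=2)
query = next_query(alpha, labeled_idx=[0])        # a node in the unexplored triangle
mean, _ = mean_and_variance(alpha)
predictions = mean.argmax(axis=1)                 # classify by the Dirichlet mean
```

With a single label in the first triangle, nodes in the second triangle receive almost no propagated evidence, so their variance stays high and the query lands there. This toy case mirrors the cluster-exploration behavior the abstract refers to, but it does not reproduce the paper's guarantees.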