Multiple Instance Learning (MIL) is a weakly supervised paradigm that has been successfully applied in many scientific areas and is particularly well suited to medical imaging. Probabilistic MIL methods, and more specifically those based on Gaussian Processes (GPs), have achieved excellent results due to their high expressiveness and uncertainty quantification capabilities. One of the most successful GP-based MIL methods, VGPMIL, resorts to a variational bound to handle the intractability of the logistic function. Here, we formulate VGPMIL using Pólya-Gamma random variables. This approach yields the same variational posterior approximations as the original VGPMIL, which is a consequence of the two representations that the Hyperbolic Secant distribution admits. This leads us to propose a general GP-based MIL method that takes different forms by simply leveraging distributions other than the Hyperbolic Secant one. Using the Gamma distribution, we arrive at a new approach that obtains competitive or superior predictive performance and efficiency. This is validated in a comprehensive experimental study including one synthetic MIL dataset, two well-known MIL benchmarks, and a real-world medical problem. We expect this work to provide useful ideas beyond MIL that can foster further research in the field.
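The link between a variational bound on the logistic function and the Pólya-Gamma construction can be checked numerically. The sketch below is illustrative only and is not taken from the paper: it assumes the bound in question is of the Jaakkola-Jordan type (the abstract does not name it), and the function names `sigmoid_via_sech`, `pg_mean`, `jj_lambda` and `jj_lower_bound` are hypothetical. It relies on two standard facts: the logistic function can be written with a hyperbolic secant factor, sigma(x) = exp(x/2) / (2 cosh(x/2)), and the mean of a PG(1, c) variable is tanh(c/2) / (2c), which is exactly twice the Jaakkola-Jordan coefficient lambda(c).

```python
import math


def sigmoid(x):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_via_sech(x):
    """Logistic function written with a hyperbolic secant factor:
    sigma(x) = exp(x/2) / (2 * cosh(x/2))."""
    return 0.5 * math.exp(0.5 * x) / math.cosh(0.5 * x)


def pg_mean(c):
    """Mean of a Polya-Gamma PG(1, c) variable: tanh(c/2) / (2c).
    The limit as c -> 0 is 1/4."""
    if abs(c) < 1e-8:
        return 0.25
    return math.tanh(0.5 * c) / (2.0 * c)


def jj_lambda(xi):
    """Jaakkola-Jordan coefficient lambda(xi) = tanh(xi/2) / (4 xi),
    i.e. half the PG(1, xi) mean."""
    return 0.5 * pg_mean(xi)


def jj_lower_bound(x, xi):
    """Jaakkola-Jordan lower bound on sigma(x), tight at x = +/- xi:
    sigma(xi) * exp((x - xi)/2 - lambda(xi) * (x^2 - xi^2))."""
    return sigmoid(xi) * math.exp(
        0.5 * (x - xi) - jj_lambda(xi) * (x * x - xi * xi)
    )


if __name__ == "__main__":
    xi = 1.5  # variational parameter, chosen arbitrarily for the demo
    for x in (-3.0, -1.5, 0.0, 1.5, 3.0):
        print(x, sigmoid(x), sigmoid_via_sech(x), jj_lower_bound(x, xi))
```

In both views the quadratic term in x is weighted by the same quantity, which gives some intuition for why the two routes can lead to the same Gaussian variational posterior; the paper's own derivation should be consulted for the exact argument.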