Multiple Instance Learning (MIL) is a weakly supervised problem in which a single label is assigned to a whole bag of instances. An important class of MIL models is instance-based: instances are classified first, and their predictions are then aggregated to obtain the bag label. The most common MIL assumption treats a bag as positive if at least one of its instances is positive. However, this assumption does not hold in many real-life scenarios, where a positive bag label often results from a certain percentage of positive instances. To address this issue, we introduce ProMIL, a dedicated instance-based method based on deep neural networks and Bernstein polynomial estimation. An important advantage of ProMIL is that it automatically detects the optimal percentage level for decision-making. We show that ProMIL outperforms standard instance-based MIL in real-world medical applications. We make the code available.
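To make the contrast between the two bag-labeling assumptions concrete, the following minimal sketch compares the standard "at least one positive instance" rule with a percentage-based rule. It is an illustration only, not the ProMIL method: the function names, the instance scores, the 0.5 instance cutoff, and the hand-picked `fraction` are all assumptions made for this example, whereas ProMIL is described as detecting the percentage level automatically.

```python
from typing import Sequence


def bag_label_standard(instance_probs: Sequence[float], cutoff: float = 0.5) -> int:
    """Standard MIL assumption: the bag is positive if any instance is positive."""
    return int(any(p >= cutoff for p in instance_probs))


def bag_label_percentage(
    instance_probs: Sequence[float], fraction: float, cutoff: float = 0.5
) -> int:
    """Percentage-based assumption: the bag is positive if at least
    `fraction` of its instances are positive (fraction is fixed by hand here)."""
    if not instance_probs:
        raise ValueError("bag must contain at least one instance")
    positives = sum(p >= cutoff for p in instance_probs)
    return int(positives / len(instance_probs) >= fraction)


# Hypothetical instance scores for one bag of five instances.
bag = [0.9, 0.2, 0.1, 0.3, 0.05]
standard = bag_label_standard(bag)                    # one positive instance suffices
percentage = bag_label_percentage(bag, fraction=0.4)  # needs at least 40% positives
```

With these illustrative scores, only one of the five instances exceeds the cutoff, so the standard rule labels the bag positive while the 40% rule labels it negative. This is the kind of disagreement that motivates a percentage-based decision.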