In this paper, we propose a reinforcement-learning-based, weakly supervised system for localisation. We train a controller function to localise regions of interest within an image by introducing a novel reward definition that utilises the non-binarised classification probability generated by a pre-trained binary classifier, which classifies object presence in images or image crops. The object-presence classifier can thus inform the controller of its localisation quality by quantifying the likelihood that the image or crop contains an object. Such an approach allows us to minimise the potential bias introduced by the human labelling required for fully supervised localisation. We evaluate the proposed approach on the task of cancerous lesion localisation using a large dataset of real clinical bi-parametric MR images of the prostate. Comparisons with the commonly used multiple-instance learning approach to weakly supervised localisation and with a fully supervised baseline show that the proposed method outperforms multiple-instance learning and performs comparably to fully supervised learning, while using only image-level classification labels for training.
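The reward idea described above can be illustrated with a minimal sketch. It assumes, as one plausible reading of the abstract, that the reward for a candidate region is simply the raw probability the object-presence classifier assigns to the corresponding crop. The names `crop`, `localisation_reward`, and `toy_presence_prob`, and the `(top, left, height, width)` box convention, are hypothetical. The toy classifier is a stand-in for a pre-trained network, not the model used in this work, and the exact reward formulation may differ.

```python
from typing import Callable, List, Tuple

Image = List[List[float]]        # 2D grid of pixel intensities in [0, 1]
Box = Tuple[int, int, int, int]  # (top, left, height, width); hypothetical convention


def crop(image: Image, box: Box) -> Image:
    """Return the sub-image covered by the box."""
    top, left, height, width = box
    return [row[left:left + width] for row in image[top:top + height]]


def localisation_reward(
    image: Image, box: Box, presence_prob: Callable[[Image], float]
) -> float:
    """Assumed reward: the non-binarised probability that the crop contains an object."""
    return presence_prob(crop(image, box))


def toy_presence_prob(patch: Image) -> float:
    """Stand-in for a pre-trained classifier: mean intensity clipped to [0, 1]."""
    values = [v for row in patch for v in row]
    if not values:
        return 0.0
    return min(1.0, max(0.0, sum(values) / len(values)))


# A 4x4 image with a bright 2x2 "object" at rows 1-2, columns 1-2.
image = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 1.0, 0.0],
    [0.0, 1.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]

on_target = localisation_reward(image, (1, 1, 2, 2), toy_presence_prob)
off_target = localisation_reward(image, (0, 0, 2, 2), toy_presence_prob)
print(on_target, off_target)
```

Because the probability is not thresholded, a box that only partly overlaps the object still receives a graded signal, which gives the controller something to improve on. A binarised reward would give no such gradation between partially and fully correct boxes.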