Boundary-RL: Reinforcement Learning for Weakly-Supervised Prostate Segmentation in TRUS Images

We propose Boundary-RL, a novel weakly supervised segmentation method that utilises only patch-level labels for training. We frame segmentation as a boundary detection problem rather than as pixel-level classification, as in previous works. This framing may allow boundary delineation in challenging scenarios, such as when noise artefacts are present within the region-of-interest (ROI) boundaries, where traditional weakly supervised methods based on pixel-level classification may fail to segment the ROI effectively. Ultrasound images are of particular interest: because their intensity values represent acoustic impedance differences across boundaries, they may also benefit from a boundary delineation approach. Our method uses reinforcement learning to train a controller function to localise ROI boundaries, using a reward derived from a pre-trained boundary-presence classifier. As the controller modifies the patch location in a sequential Markov decision process, the classifier indicates when an object boundary is encountered within a patch. The classifier itself is trained using only binary patch-level labels of object presence, which are the only labels used to train the entire boundary delineation framework, and it serves as a weak signal to inform the boundary delineation. Using a controller function means that a sliding window over the entire image is not necessary. It may also reduce false-positive and false-negative cases by minimising the number of patches passed to the boundary-presence classifier. We evaluate the proposed approach on the clinically relevant task of prostate gland segmentation in trans-rectal ultrasound images. We show improved performance compared with other tested weakly supervised methods that use the same labels, e.g., multiple instance learning.
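
To make the interaction between the controller and the boundary-presence classifier concrete, the following is a minimal sketch of the patch-moving decision process on a synthetic image. It is not the authors' implementation. The patch size, the four-direction action set, the toy disc image, the rule-based `boundary_presence` stand-in for the pre-trained classifier, and the hand-written `toward_centre` policy are all assumptions introduced for illustration. In the method described above, the policy would be learned with reinforcement learning and the classifier would be a model trained on binary patch-level labels.

```python
import numpy as np

PATCH = 8  # patch side length in pixels (assumed value)
# Assumed action set: shift the patch up, down, left or right by 4 pixels.
ACTIONS = {0: (-4, 0), 1: (4, 0), 2: (0, -4), 3: (0, 4)}


def make_toy_image(size=64, radius=16):
    """Synthetic image: a bright disc (the 'ROI') on a dark background."""
    yy, xx = np.mgrid[:size, :size]
    inside = (yy - size // 2) ** 2 + (xx - size // 2) ** 2 <= radius ** 2
    return inside.astype(float)


def extract_patch(image, row, col):
    """Crop a PATCH x PATCH window whose top-left corner is (row, col)."""
    return image[row:row + PATCH, col:col + PATCH]


def boundary_presence(patch):
    """Hypothetical stand-in for the pre-trained classifier.

    Returns 1.0 when the patch mixes object and background pixels,
    otherwise 0.0. A real classifier would be a trained model.
    """
    frac = patch.mean()
    return float(0.0 < frac < 1.0)


def step(image, state, action):
    """One MDP transition: move the patch, clip to the image, score it."""
    dr, dc = ACTIONS[action]
    limit = image.shape[0] - PATCH
    row = int(np.clip(state[0] + dr, 0, limit))
    col = int(np.clip(state[1] + dc, 0, limit))
    reward = boundary_presence(extract_patch(image, row, col))
    return (row, col), reward


def toward_centre(state):
    """Hand-written placeholder policy: go down to row 28, then right."""
    return 1 if state[0] < 28 else 3


def rollout(image, policy, start=(0, 0), max_steps=50):
    """Run the controller until the classifier reports a boundary."""
    state, visited = start, []
    for _ in range(max_steps):
        state, reward = step(image, state, policy(state))
        visited.append((state, reward))
        if reward > 0:  # boundary found: stop the episode
            break
    return visited


if __name__ == "__main__":
    trajectory = rollout(make_toy_image(), toward_centre)
    print(len(trajectory), trajectory[-1])
```

In this sketch the classifier is queried only on the patches visited along the trajectory, not on every window of the image, which is the property the controller is meant to provide.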
