Instance segmentation is an important computer vision problem that remains challenging despite impressive recent advances driven by deep learning-based methods. Given sufficient training data, fully supervised methods can yield excellent performance, but annotating ground-truth data remains a major bottleneck, especially in biomedical applications, where it has to be performed by domain experts. The number of labels required can be drastically reduced by using rules derived from prior knowledge to guide the segmentation. However, these rules are in general not differentiable and thus cannot be used with existing methods. Here, we relax the differentiability requirement by using stateless actor-critic reinforcement learning, which accommodates non-differentiable rewards. We formulate instance segmentation as a graph partitioning problem in which the actor-critic predicts the edge weights. Training is driven by rewards based on how well the segmented instances conform to high-level priors on object shape, position, or size. Experiments on toy and real datasets demonstrate that our approach can achieve excellent performance without any direct supervision, relying only on a rich set of priors.
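To make the idea of learning edge weights from a non-differentiable reward concrete, the following is a minimal sketch under stated assumptions, not the method evaluated in this work. It swaps in two deliberately simple stand-ins: a thresholded union-find merge in place of a proper graph partitioning solver, and a single-step score-function (REINFORCE-style) update with a running-mean baseline in place of the actor-critic. The names `partition`, `size_prior_reward`, and `train_cut_logits`, the toy chain graph, and the size prior with its `tolerance` parameter are all hypothetical and chosen only for illustration.

```python
import numpy as np


def partition(num_nodes, edges, weights, threshold=0.5):
    """Merge the endpoints of every edge whose weight is below `threshold`
    (union-find) and return one integer instance label per node."""
    parent = list(range(num_nodes))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for (u, v), w in zip(edges, weights):
        if w < threshold:  # low weight means "do not cut this edge"
            parent[find(u)] = find(v)
    roots = [find(i) for i in range(num_nodes)]
    _, labels = np.unique(roots, return_inverse=True)
    return labels


def size_prior_reward(labels, expected_size, tolerance=1):
    """Per-instance reward: 1.0 if the instance size is within `tolerance`
    of `expected_size`, else 0.0. This is a step function of the partition,
    so it provides no useful gradient with respect to the edge weights."""
    sizes = np.bincount(labels)
    return (np.abs(sizes - expected_size) <= tolerance).astype(float)


def train_cut_logits(num_nodes, edges, expected_size, steps=500, lr=0.5, seed=0):
    """Assumed toy training loop: one logit per edge stands in for the
    actor's output, and a running mean reward stands in for the critic."""
    rng = np.random.default_rng(seed)
    logits = np.zeros(len(edges))
    baseline = 0.0
    for _ in range(steps):
        probs = 1.0 / (1.0 + np.exp(-logits))
        # Sample a binary cut decision per edge (1.0 = cut, 0.0 = merge).
        cuts = (rng.random(len(edges)) < probs).astype(float)
        labels = partition(num_nodes, edges, cuts)
        # Averaging over instances is a simplification made for this sketch.
        reward = size_prior_reward(labels, expected_size, tolerance=0).mean()
        # d/d(logit) of log Bernoulli(cut; prob) equals (cut - prob).
        logits += lr * (reward - baseline) * (cuts - probs)
        baseline += 0.1 * (reward - baseline)
    return logits


if __name__ == "__main__":
    # Toy chain of six nodes; the size prior favors instances of three nodes.
    chain = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]
    learned = train_cut_logits(6, chain, expected_size=3)
    print("cut probabilities:", 1.0 / (1.0 + np.exp(-learned)))
```

Each update here is a single decision followed by a single reward, with no state transitions, which is the sense in which the sketch is stateless. The reward is only ever evaluated, never differentiated, so any rule that can score a finished segmentation could be substituted for the size prior.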