Digital whole slide images contain an enormous amount of information, providing strong motivation for the development of automated image analysis tools. Deep neural networks in particular show high potential for a variety of tasks in digital pathology. However, typical deep learning algorithms require (manual) annotations in addition to large amounts of image data to enable effective training, which limits their applicability. Multiple instance learning is a powerful tool for training deep neural networks when fully annotated data is not available. These methods are particularly effective in this domain because labels for a complete whole slide image are often captured routinely, whereas labels for patches, regions, or pixels are not. This potential has already resulted in a considerable number of publications, the majority of them published in the last three years. Besides the availability of data and strong motivation from the medical perspective, the availability of powerful graphics processing units acts as an accelerator in this field. In this paper, we provide an overview of widely and effectively used concepts in deep multiple instance learning, summarize recent advances, and critically discuss remaining challenges and future potential.