Reinforcement learning using Deep Q Networks and Q learning accurately localizes brain tumors on MRI with very small training sets

Purpose: Supervised deep learning in radiology suffers from notorious inherent limitations: 1) it requires large, hand-annotated data sets, 2) it is non-generalizable, and 3) it lacks explainability and intuition. We have recently proposed reinforcement learning to address all three. However, we applied it to images with radiologist eye tracking points, which limits the state-action space. Here we generalize the Deep Q learning approach to a gridworld-based environment, so that only the images and image masks are required.

Materials and Methods: We trained a Deep Q network on 30 two-dimensional image slices from the BraTS brain tumor database. Each image contained one lesion. We then tested the trained Deep Q network on a separate set of 30 test images. For comparison, we also trained and tested a supervised deep learning keypoint detection network on the same training and testing images.

Results: Whereas the supervised approach quickly overfit the training data and predictably performed poorly on the testing set (11% accuracy), the Deep Q learning approach showed progressively improved generalizability to the testing set over training time, reaching 70% accuracy.

Conclusion: We have shown a proof-of-principle application of reinforcement learning to radiological images, here using 2D contrast-enhanced MRI brain images with the goal of localizing brain tumors. This represents a generalization of recent work to a gridworld setting, which is naturally suited to analyzing medical images.
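To make the gridworld idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes the image is divided into a coarse grid of cells, that the agent occupies one cell and can move up, down, left, right, or stay, and that the reward is +1 when the agent lands on a cell overlapping the lesion mask and -1 otherwise. For brevity it swaps in tabular Q-learning on a single synthetic mask in place of a Deep Q network trained on image pixels. The names `MaskGridworld`, `train_q_table`, and `greedy_rollout`, the grid size, the reward values, and all hyperparameters are illustrative assumptions.

```python
import numpy as np


class MaskGridworld:
    """Toy gridworld over an image mask (hypothetical design, for illustration)."""

    # Assumed action set: up, down, left, right, stay.
    ACTIONS = ((-1, 0), (1, 0), (0, -1), (0, 1), (0, 0))

    def __init__(self, mask, grid=4):
        self.grid = grid
        cell_h, cell_w = mask.shape[0] // grid, mask.shape[1] // grid
        # A cell counts as "lesion" if any mask pixel falls inside it.
        self.lesion = np.array([
            [mask[r * cell_h:(r + 1) * cell_h, c * cell_w:(c + 1) * cell_w].any()
             for c in range(grid)]
            for r in range(grid)
        ])
        self.pos = (0, 0)

    def reset(self):
        # Every episode starts in the top-left cell.
        self.pos = (0, 0)
        return self.pos

    def step(self, action):
        d_row, d_col = self.ACTIONS[action]
        # Moves that would leave the grid are clipped to the border.
        row = min(max(self.pos[0] + d_row, 0), self.grid - 1)
        col = min(max(self.pos[1] + d_col, 0), self.grid - 1)
        self.pos = (row, col)
        # Assumed reward: +1 on a lesion cell, -1 anywhere else.
        reward = 1.0 if self.lesion[row, col] else -1.0
        return self.pos, reward


def train_q_table(env, episodes=500, steps=20, alpha=0.5, gamma=0.9,
                  epsilon=0.3, seed=0):
    """Epsilon-greedy tabular Q-learning (stand-in for the Deep Q network)."""
    rng = np.random.default_rng(seed)
    n_actions = len(env.ACTIONS)
    q = np.zeros((env.grid, env.grid, n_actions))
    for _ in range(episodes):
        state = env.reset()
        for _ in range(steps):
            if rng.random() < epsilon:
                action = int(rng.integers(n_actions))
            else:
                action = int(np.argmax(q[state]))
            next_state, reward = env.step(action)
            # Q-learning update: bootstrap from the best next-state value.
            target = reward + gamma * q[next_state].max()
            q[state + (action,)] += alpha * (target - q[state + (action,)])
            state = next_state
    return q


def greedy_rollout(env, q, steps=10):
    """Follow the learned greedy policy and return the final cell."""
    state = env.reset()
    for _ in range(steps):
        state, _ = env.step(int(np.argmax(q[state])))
    return state


# Synthetic 32x32 mask with a single rectangular "lesion".
mask = np.zeros((32, 32), dtype=bool)
mask[18:22, 17:23] = True

env = MaskGridworld(mask, grid=4)
q = train_q_table(env)
final_cell = greedy_rollout(env, q)
print("final cell:", final_cell, "on lesion:", bool(env.lesion[final_cell]))
```

In this toy setup, only the mask is needed to define the reward, which mirrors the abstract's point that images and masks suffice. A Deep Q network would replace the lookup table with a function of the image content, so that the learned policy could transfer to unseen test images.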
