Reinforcement learning (RL)-based tractography is a competitive alternative to machine learning and classical tractography algorithms, owing to its high anatomical accuracy obtained without any annotated data. However, the reward functions used so far to train RL agents do not encapsulate anatomical knowledge, which causes agents to generate spurious false-positive tracts. In this paper, we propose a new RL tractography system, TractOracle, which relies on a reward network trained for streamline classification. This network is used both as a reward function during training and as a means of stopping the tracking process early, thereby reducing the number of false-positive streamlines. This makes our system a unique method that evaluates and reconstructs white-matter (WM) streamlines at the same time. We report an improvement in true-positive ratios of almost 20\% and a threefold reduction in false-positive ratios on one dataset, and a 2x to 7x increase in the number of true-positive streamlines on another dataset.
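The dual use of the reward network described above can be sketched as follows. This is a minimal, hypothetical illustration only: the `oracle` below is a toy geometric heuristic standing in for the learned streamline classifier, and the names `step_reward`, `should_stop`, and the `threshold` value are assumptions, not part of TractOracle itself.

```python
import numpy as np

def oracle(streamline: np.ndarray) -> float:
    """Toy stand-in for the reward network: returns a plausibility
    score in [0, 1]. Here we simply penalize sharp turns between
    consecutive segments; the real system uses a trained classifier."""
    segs = np.diff(streamline, axis=0)
    segs = segs / np.linalg.norm(segs, axis=1, keepdims=True)
    cos = np.sum(segs[:-1] * segs[1:], axis=1)
    return float(np.clip(cos.mean(), 0.0, 1.0))

def step_reward(streamline: np.ndarray) -> float:
    # Use 1: the classifier score serves as the RL reward signal.
    return oracle(streamline)

def should_stop(streamline: np.ndarray, threshold: float = 0.5) -> bool:
    # Use 2: terminate tracking early once the oracle deems the
    # partial streamline implausible, curbing false positives.
    return oracle(streamline) < threshold

# A straight toy streamline scores high and keeps being tracked,
# while a zigzagging one triggers early stopping.
straight = np.array([[0., 0., 0.], [1., 0., 0.], [2., 0., 0.], [3., 0., 0.]])
zigzag = np.array([[0., 0., 0.], [1., 0., 0.], [1., 1., 0.], [2., 1., 0.]])
assert not should_stop(straight)
assert should_stop(zigzag)
```

The point of the sketch is the shared scoring function: because the same network scores streamlines for reward and for termination, evaluation and reconstruction happen in a single pass rather than as a post-hoc filtering step.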