Predicting recovery following stroke: deep learning, multimodal data and feature selection using explainable AI

Machine learning offers great potential for automated prediction of post-stroke symptoms and their response to rehabilitation. Major challenges for this endeavour include the very high dimensionality of neuroimaging data, the relatively small size of the datasets available for learning, and the question of how to effectively combine neuroimaging and tabular data (e.g. demographic information and clinical characteristics). This paper evaluates several solutions based on two strategies. The first is to use 2D images that summarise MRI scans. The second is to select key features that improve classification accuracy. Additionally, we introduce the novel approach of training a convolutional neural network (CNN) on images that combine regions-of-interest extracted from MRIs with symbolic representations of tabular data. We evaluate a series of CNN architectures (both 2D and 3D) that are trained on different representations of MRI and tabular data, to predict whether a composite measure of post-stroke spoken picture description ability is in the aphasic or non-aphasic range. MRI and tabular data were acquired from 758 English-speaking stroke survivors who participated in the PLORAS study. The classification accuracy for a baseline logistic regression was 0.678 for lesion size alone, rising to 0.757 and 0.813 when initial symptom severity and recovery time were successively added. The highest classification accuracy, 0.854, was observed when 8 regions-of-interest were extracted from each MRI scan and combined with lesion size, initial severity and recovery time in a 2D Residual Neural Network. Our findings demonstrate how imaging and tabular data can be combined for high post-stroke classification accuracy, even when the dataset is small in machine learning terms. We conclude by proposing how the current models could be improved to achieve even higher levels of accuracy using images from hospital scanners.
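To make the idea of a single image that carries both regions-of-interest and tabular data more concrete, the following is a minimal sketch under stated assumptions. The abstract does not specify how the symbolic representation is drawn, so the bar encoding used here is purely illustrative, and the names `tabular_to_bars` and `compose_image`, the 32x32 patch size, the grid layout and the synthetic inputs are all hypothetical rather than taken from the study.

```python
import numpy as np


def tabular_to_bars(values, height, bar_width):
    """Encode each value in [0, 1] as a vertical bar whose filled height
    is proportional to the value (an assumed encoding, not the paper's)."""
    panel = np.zeros((height, bar_width * len(values)), dtype=np.float32)
    for i, v in enumerate(values):
        filled = int(round(float(np.clip(v, 0.0, 1.0)) * height))
        if filled > 0:
            # Fill from the bottom of the panel upwards.
            panel[height - filled:, i * bar_width:(i + 1) * bar_width] = 1.0
    return panel


def compose_image(roi_patches, tabular_values, grid_cols=4, bar_width=8):
    """Tile equally sized 2D ROI patches into a grid and append the
    tabular bar panel on the right-hand side."""
    patches = [np.asarray(p, dtype=np.float32) for p in roi_patches]
    ph, pw = patches[0].shape
    rows = -(-len(patches) // grid_cols)  # ceiling division
    grid = np.zeros((rows * ph, grid_cols * pw), dtype=np.float32)
    for k, patch in enumerate(patches):
        r, c = divmod(k, grid_cols)
        grid[r * ph:(r + 1) * ph, c * pw:(c + 1) * pw] = patch
    bars = tabular_to_bars(tabular_values, grid.shape[0], bar_width)
    return np.concatenate([grid, bars], axis=1)


# Synthetic stand-ins: 8 ROI slices and three tabular values
# (lesion size, initial severity, recovery time), pre-scaled to [0, 1].
rng = np.random.default_rng(0)
rois = [rng.random((32, 32)) for _ in range(8)]
tabular = [0.25, 0.5, 1.0]

image = compose_image(rois, tabular)
print(image.shape)  # (64, 152)
```

An array of this kind could then be passed to a 2D CNN as a single-channel input, so that the network sees imaging and tabular information in one tensor rather than through separate branches.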
