Multiple Sclerosis (MS) is a chronic disease characterized by progressive or alternating impairment of neurological functions (motor, sensory, visual, and cognitive). Predicting disease progression with a probabilistic, time-dependent approach may help suggest interventions that delay it. However, extracting useful information from irregularly collected longitudinal data is difficult, and missing data pose significant challenges. MS progression is measured with the Expanded Disability Status Scale (EDSS), which quantifies and monitors disability in MS over time. EDSS assesses impairment in eight functional systems (FS). Frequently, only the overall EDSS score assigned by clinicians is reported, while the FS sub-scores are missing. Imputing these sub-scores might be useful, especially for stratifying patients according to their phenotype as assessed over the course of the disease. This study aimed to (i) explore different methodologies for imputing missing FS sub-scores, and (ii) predict the EDSS score using the completed clinical data. Results show that the Exponentially Weighted Moving Average (EWMA) achieved the lowest error rate in the missing data imputation task; furthermore, the combination of Classification and Regression Trees (CART) for imputation and a Support Vector Machine (SVM) for prediction achieved the best accuracy.
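To illustrate the kind of imputation referred to above, the following is a minimal sketch of exponentially weighted moving average imputation for a single FS sub-score series. It is not the study's implementation: the function name `ewma_impute`, the smoothing factor `alpha = 0.5`, and the example visit values are all assumptions made for illustration, and the sketch treats visits as equally spaced, which ignores the irregular sampling mentioned in the abstract.

```python
from typing import List, Optional


def ewma_impute(scores: List[Optional[float]], alpha: float = 0.5) -> List[Optional[float]]:
    """Fill missing entries (None) with the EWMA of earlier observed values.

    Hypothetical helper for illustration; alpha is an assumed smoothing factor.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    filled: List[Optional[float]] = []
    ewma: Optional[float] = None  # running average over observed values so far
    for value in scores:
        if value is None:
            # Missing visit: use the current average (stays None if nothing was observed yet).
            filled.append(ewma)
        else:
            # Observed visit: keep the value and update the running average.
            ewma = value if ewma is None else alpha * value + (1 - alpha) * ewma
            filled.append(value)
    return filled


# Made-up FS sub-score series for one patient, ordered by visit.
visits = [1.0, None, 2.0, None, 3.0]
imputed = ewma_impute(visits, alpha=0.5)
print(imputed)
```

Because FS sub-scores are ordinal, a practical pipeline would probably round the imputed values to the nearest valid grade; that step is left out here to keep the averaging itself visible.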