MUTE-Reco: MUTual Information Assisted Ensemble Feature RECOmmender System for Healthcare Prognosis

Purpose: Health recommenders act as important decision support systems, aiding patients and medical professionals in taking actions that lead to patients' well-being. These systems extract the information most relevant to the end user, helping them make appropriate decisions. The present study proposes a feature recommender that identifies and recommends the most important risk factors for healthcare prognosis.

Methods: A novel mutual information and ensemble-based feature ranking approach, termed MUTE-Reco, is proposed. It considers the ranks of features obtained from eight popular feature selection methods.

Results: To establish the effectiveness of the proposed method, experiments were conducted on four benchmark datasets covering diverse diseases: clear cell renal cell carcinoma (ccRCC), chronic kidney disease, Indian liver patient, and cervical cancer risk factors. The performance of the proposed recommender is compared with four state-of-the-art methods using recommender-system performance metrics: average precision@K, precision@K, recall@K, F1@K, and reciprocal rank@K. Experimental results show that a model built with the recommended features attains higher accuracy than existing methods (96.6% and 98.6% using a support vector machine and a neural network, respectively) when classifying the different stages of ccRCC, while using a reduced feature set. Moreover, the top two features recommended by the proposed method for ccRCC, namely tumor size and metastasis status, are medically validated by the existing TNM staging system. Results are also found to be superior on the other three datasets.

Conclusion: The proposed recommender, MUTE-Reco, can identify and recommend the risk factors that have the most discriminating power for detecting diseases.
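The @K metrics named in the Results can be illustrated with a minimal sketch. It uses the common textbook definitions, which may differ in detail from those used in the study; in particular, normalizing average precision@K by min(K, number of relevant features) is one convention among several. The feature names and the set of "relevant" features below are hypothetical and serve only as an example.

```python
from typing import Sequence, Set


def precision_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top-k recommended features that are relevant."""
    return sum(f in relevant for f in ranked[:k]) / k


def recall_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of all relevant features that appear in the top k."""
    return sum(f in relevant for f in ranked[:k]) / len(relevant)


def f1_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Harmonic mean of precision@k and recall@k."""
    p = precision_at_k(ranked, relevant, k)
    r = recall_at_k(ranked, relevant, k)
    return 0.0 if p + r == 0 else 2 * p * r / (p + r)


def reciprocal_rank_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """1 / position of the first relevant feature in the top k, else 0."""
    for pos, f in enumerate(ranked[:k], start=1):
        if f in relevant:
            return 1.0 / pos
    return 0.0


def average_precision_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Mean of precision values taken at each relevant hit in the top k.

    Assumption: normalized by min(k, number of relevant features).
    """
    hits, total = 0, 0.0
    for pos, f in enumerate(ranked[:k], start=1):
        if f in relevant:
            hits += 1
            total += hits / pos
    denom = min(k, len(relevant))
    return total / denom if denom else 0.0


# Hypothetical ranking produced by a feature recommender, best first.
ranked = ["tumor_size", "metastasis", "age", "grade", "gender"]
# Hypothetical ground-truth set of relevant risk factors.
relevant = {"tumor_size", "metastasis", "grade"}

for k in (3, 4):
    print(
        k,
        round(precision_at_k(ranked, relevant, k), 3),
        round(recall_at_k(ranked, relevant, k), 3),
        round(f1_at_k(ranked, relevant, k), 3),
        round(reciprocal_rank_at_k(ranked, relevant, k), 3),
        round(average_precision_at_k(ranked, relevant, k), 3),
    )
```

All five functions take a ranked feature list, a ground-truth set, and a cutoff K, so recommenders can be compared at several cutoffs by varying K alone.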
