Heart disease is the leading non-communicable cause of death worldwide and often progresses silently. Heart diseases, or cardiovascular diseases, are classified into four types: coronary heart disease, heart failure, congenital heart disease, and cardiomyopathy. Early and accurate diagnosis is vital to avoid further injury and save patients' lives, so we need a system that can predict cardiovascular disease before it becomes critical. Machine learning has attracted considerable interest from researchers in the medical sciences, who have applied a variety of machine learning methods and approaches to heart disease prediction. In this work, we use a dataset from IEEE Data Port that is, to the best of our knowledge, one of the largest publicly available online datasets of individuals with cardiovascular disease. The dataset is a combination of the Hungarian, Cleveland, Long Beach VA, Switzerland, and Statlog datasets, with important features such as maximum heart rate achieved, serum cholesterol, chest pain type, and fasting blood sugar. To assess the efficacy and strength of the developed model, we use several performance measures: the ROC curve, AUC, specificity, sensitivity, F1-score, MCC, and accuracy. We propose a framework built on a stacked ensemble classifier that combines several machine learning algorithms, including ExtraTrees Classifier, Random Forest, and XGBoost. The proposed framework attained an accuracy of 92.34%, which is higher than the results reported in the existing literature.
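Most of the performance measures named in the abstract (sensitivity, specificity, F1-score, MCC, and accuracy) follow directly from the four cells of a binary confusion matrix. The following is a minimal sketch of those formulas, not the evaluation code used in the study. It assumes labels coded as 1 for disease and 0 for no disease; the function name `binary_metrics` and the toy labels are hypothetical and chosen only for illustration. ROC and AUC are omitted because they require predicted scores rather than hard labels.

```python
from math import sqrt


def binary_metrics(y_true, y_pred):
    """Return confusion-matrix metrics for binary labels (1 = positive class)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    def ratio(num, den):
        # Return 0.0 instead of raising when a denominator is zero.
        return num / den if den else 0.0

    sensitivity = ratio(tp, tp + fn)  # recall / true positive rate
    specificity = ratio(tn, tn + fp)  # true negative rate
    precision = ratio(tp, tp + fp)
    accuracy = ratio(tp + tn, tp + tn + fp + fn)
    f1 = ratio(2 * precision * sensitivity, precision + sensitivity)
    # Matthews correlation coefficient.
    mcc = ratio(
        tp * tn - fp * fn,
        sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    )
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
        "mcc": mcc,
        "accuracy": accuracy,
    }


# Hypothetical toy labels, not data from the study.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1, 1, 0]

metrics = binary_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

On these toy labels the confusion matrix has 4 true positives, 4 true negatives, 1 false positive, and 1 false negative, so accuracy is 0.8 while MCC is 0.6. MCC is lower because it accounts for all four cells at once, which is one reason it is often reported alongside accuracy.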