Ensemble Learning for Healthcare: A Comparative Analysis of Hybrid Voting and Ensemble Stacking in Obesity Risk Prediction

Obesity is a critical global health issue driven by dietary, physiological, and environmental factors, and is strongly associated with chronic diseases such as diabetes, cardiovascular disorders, and cancer. Machine learning has emerged as a promising approach for early obesity risk prediction, yet comparative evaluations of ensemble techniques, particularly hybrid majority voting and ensemble stacking, remain limited. This study compares hybrid majority voting and ensemble stacking for obesity risk prediction to identify which approach delivers higher accuracy and efficiency, and to highlight the complementary strengths of these techniques as a guide to predictive model selection in healthcare applications. Two datasets were used to evaluate three ensemble models: Majority Hard Voting, Weighted Hard Voting, and Stacking (with a Multi-Layer Perceptron as meta-classifier). A pool of nine Machine Learning (ML) algorithms, evaluated across a total of 50 hyperparameter configurations, was analyzed to identify the top three models to serve as base learners for the ensemble methods. Preprocessing involved dataset balancing and outlier detection, and model performance was evaluated using Accuracy and F1-Score. On Dataset-1, weighted hard voting and stacking achieved nearly identical performance (Accuracy: 0.920304, F1: 0.920070), outperforming majority hard voting. On Dataset-2, stacking achieved the best results (Accuracy: 0.989837, F1: 0.989825), ahead of majority hard voting (Accuracy: 0.981707, F1: 0.981675) and weighted hard voting, which showed the lowest performance. The findings indicate that ensemble stacking provides stronger predictive capability, particularly for complex data distributions, while hybrid majority voting remains a robust alternative.
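To make the distinction between the two voting schemes concrete, the following is a minimal sketch of majority hard voting and weighted hard voting over the class labels produced by three base learners. It is an illustration only, not the study's implementation: the class labels, the integer weights, and the tie-breaking rule (first learner in the list wins) are assumptions, and the abstract does not state how the weights were derived. Stacking is not shown, since it additionally requires training a meta-classifier on the base learners' outputs.

```python
from collections import Counter, defaultdict


def majority_hard_vote(predictions):
    """Return the label chosen by the most base learners for one sample.

    Ties are broken in favour of the label that appears first in
    `predictions` (an assumed rule, chosen for determinism).
    """
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label


def weighted_hard_vote(predictions, weights):
    """Return the label with the largest total weight for one sample.

    `weights[i]` is the vote strength of base learner i, for example a
    score derived from its validation accuracy (hypothetical choice).
    Ties are broken in the same way as in majority_hard_vote.
    """
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    best = max(totals.values())
    for label in predictions:
        if totals[label] == best:
            return label


# Hypothetical predictions from three base learners for a single sample.
sample_predictions = ["obese", "normal", "normal"]
# Hypothetical weights: the first learner is trusted three times as much.
learner_weights = [3, 1, 1]

majority_label = majority_hard_vote(sample_predictions)
weighted_label = weighted_hard_vote(sample_predictions, learner_weights)
print(majority_label, weighted_label)
```

In this example the two schemes disagree: two of the three learners vote "normal", so majority voting returns "normal", while the heavily weighted first learner pushes weighted voting to "obese". With equal weights, the two functions return the same label.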
