Machine learning models deployed in critical care settings exhibit demographic biases, particularly gender disparities, that undermine clinical trust and equitable treatment. This paper introduces FairMed-XGB, a novel framework that systematically detects and mitigates gender-based prediction bias while preserving model performance and transparency. The framework integrates into an XGBoost classifier a fairness-aware loss function that combines Statistical Parity Difference, Theil Index, and Wasserstein Distance, jointly optimised via Bayesian search. Post-mitigation evaluation on seven clinically distinct cohorts derived from the MIMIC-IV-ED and eICU databases demonstrates substantial bias reduction: Statistical Parity Difference decreases by 40 to 51 percent on MIMIC-IV-ED and by 10 to 19 percent on eICU; the Theil Index falls by four to five orders of magnitude to near-zero values; Wasserstein Distance is reduced by 20 to 72 percent. These gains come with negligible degradation in predictive accuracy (AUC-ROC drop < 0.02). SHAP-based explainability analysis shows that the framework diminishes reliance on gender-proxy features, giving clinicians actionable insight into how and where bias is corrected. FairMed-XGB offers a robust, interpretable, and ethically aligned solution for equitable clinical decision-making, paving the way for trustworthy deployment of AI in high-stakes healthcare environments.
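To make the metric combination concrete, the sketch below shows one plausible way to compute the three fairness terms named above over a model's predicted risk scores and fold them into a single penalty. It is an illustration under stated assumptions, not the paper's implementation: the helper names, the fixed weights w_spd/w_theil/w_wass, and the 0.5 decision threshold are all hypothetical, whereas in FairMed-XGB the combination is tuned jointly via Bayesian search.

```python
# Minimal sketch (not the paper's implementation) of the three gender-fairness
# metrics the abstract combines. Assumes `y_prob` holds predicted risk scores
# in [0, 1] and `gender` is a binary protected attribute (0/1); the weights
# and the 0.5 threshold are hypothetical placeholders.
import numpy as np
from scipy.stats import wasserstein_distance

def statistical_parity_difference(y_prob, gender, threshold=0.5):
    """Gap in positive-prediction rates between the two gender groups."""
    y_hat = (np.asarray(y_prob) >= threshold).astype(int)
    gender = np.asarray(gender)
    return abs(y_hat[gender == 0].mean() - y_hat[gender == 1].mean())

def theil_index(y_prob, eps=1e-12):
    """Theil (generalised entropy, alpha = 1) index over predicted scores:
    T = mean((x / mu) * ln(x / mu))."""
    x = np.asarray(y_prob, dtype=float) + eps  # guard against log(0)
    mu = x.mean()
    return float(np.mean((x / mu) * np.log(x / mu)))

def fairness_penalty(y_prob, gender, w_spd=1.0, w_theil=1.0, w_wass=1.0):
    """Weighted sum of SPD, Theil index, and the Wasserstein distance
    between the score distributions of the two gender groups."""
    y_prob, gender = np.asarray(y_prob), np.asarray(gender)
    spd = statistical_parity_difference(y_prob, gender)
    theil = theil_index(y_prob)
    wass = wasserstein_distance(y_prob[gender == 0], y_prob[gender == 1])
    return w_spd * spd + w_theil * theil + w_wass * wass
```

In such a setup, the penalty would typically be added to the classifier's training objective or used as part of the score that the Bayesian search minimises when selecting hyperparameters and metric weights.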