Fairness in clinical prediction models remains a persistent challenge, particularly in high-stakes applications such as spinal fusion surgery for scoliosis, where patient outcomes are highly heterogeneous. Many existing fairness approaches rely on coarse demographic adjustments or post-hoc corrections, which fail to capture the latent structure of clinical populations and may unintentionally reinforce bias. We propose FAIR-MTL, a fairness-aware multitask learning framework for equitable, fine-grained prediction of postoperative complication severity. Instead of using explicit sensitive attributes during model training, FAIR-MTL infers subgroups from the data: we extract a compact demographic embedding and apply k-means clustering to uncover latent patient subgroups that may be differentially affected by traditional models. The inferred subgroup labels determine task routing within a shared multitask architecture. During training, inverse-frequency weighting mitigates subgroup imbalance, and regularization prevents overfitting to smaller groups. Applied to postoperative complication prediction with four severity levels, FAIR-MTL achieves an AUC of 0.86 and an accuracy of 75%, outperforming single-task baselines while substantially reducing bias. For gender, the demographic parity difference decreases to 0.055 and the equalized odds difference to 0.094; for age, these values decrease to 0.056 and 0.148, respectively. SHAP and Gini importance analyses support model interpretability and consistently highlight clinically meaningful predictors such as hemoglobin, hematocrit, and patient weight. Our findings show that incorporating unsupervised subgroup discovery into a multitask framework enables more equitable, interpretable, and clinically actionable predictions for surgical risk stratification.
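To make the inverse-frequency weighting and the two reported fairness metrics concrete, the following is a minimal sketch, not the authors' implementation. It rests on several assumptions: the function names are hypothetical, subgroup labels are taken to come from an earlier clustering step, weights are normalized to a mean of 1, and the metrics are computed on binary predictions (for example, severe versus non-severe complication) using the common max-minus-min gap across groups. The abstract does not state how these quantities are computed for the four-level outcome.

```python
import numpy as np


def inverse_frequency_weights(subgroups):
    """Per-sample weights proportional to 1 / subgroup size.

    Assumption: weights are rescaled so that their mean is 1.
    """
    subgroups = np.asarray(subgroups)
    _, inverse, counts = np.unique(
        subgroups, return_inverse=True, return_counts=True
    )
    weights = 1.0 / counts[inverse]
    return weights * len(subgroups) / weights.sum()


def demographic_parity_difference(y_pred, groups):
    """Largest gap in positive-prediction rate between any two groups."""
    y_pred, groups = np.asarray(y_pred), np.asarray(groups)
    rates = [y_pred[groups == g].mean() for g in np.unique(groups)]
    return max(rates) - min(rates)


def equalized_odds_difference(y_true, y_pred, groups):
    """Larger of the between-group gaps in FPR and TPR.

    Assumption: every group contains both positive and negative cases;
    otherwise the corresponding rate is undefined (NaN).
    """
    y_true, y_pred, groups = map(np.asarray, (y_true, y_pred, groups))
    gaps = []
    for label in (0, 1):  # label 0 gives the FPR gap, label 1 the TPR gap
        rates = [
            y_pred[(groups == g) & (y_true == label)].mean()
            for g in np.unique(groups)
        ]
        gaps.append(max(rates) - min(rates))
    return max(gaps)


# Toy data, invented for illustration only.
weights = inverse_frequency_weights(["a", "a", "a", "b"])
y_true = [1, 0, 1, 0, 1, 0, 1, 0]
y_pred = [1, 0, 1, 1, 1, 0, 0, 0]
groups = [0, 0, 0, 0, 1, 1, 1, 1]
dpd = demographic_parity_difference(y_pred, groups)
eod = equalized_odds_difference(y_true, y_pred, groups)
```

In this toy example the single member of subgroup "b" receives three times the weight of each member of subgroup "a", so both subgroups contribute equally to a weighted loss. Values of either metric closer to 0 indicate smaller disparities between groups.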