Background: Ventilator-associated pneumonia (VAP) in traumatic brain injury (TBI) patients carries a significant mortality risk and imposes a considerable financial burden on patients and healthcare systems. Timely detection and prognostication of VAP in TBI patients are crucial to improving patient outcomes and easing the strain on healthcare resources. Methods: We implemented six machine learning models using the MIMIC-III database: support vector machine (SVM), Logistic Regression, Random Forest, XGBoost, artificial neural network (ANN), and AdaBoost. Our methodology included feature selection with CatBoost and expert opinion, correction of class imbalance with the Synthetic Minority Oversampling Technique (SMOTE), and hyperparameter tuning through 5-fold cross-validation. Additionally, we conducted SHAP analysis to determine feature importance and performed an ablation study to assess the impact of individual features on model performance. Results: Models were evaluated on AUC, accuracy, specificity, sensitivity, F1 score, positive predictive value (PPV), and negative predictive value (NPV). XGBoost outperformed the baseline models and the best results reported in the existing literature, achieving an AUC of 0.940 and an accuracy of 0.875. These values are 23.4 and 23.5 percentage points higher, respectively, than the best previously reported results (AUC of 0.706 and accuracy of 0.640). This improvement underscores the model's potential in clinical settings. Conclusions: This study advances the predictive modeling of VAP in TBI patients, improving the potential for early detection and intervention. Refined feature selection and advanced ensemble techniques substantially improved model accuracy and reliability, offering promising directions for future clinical applications and medical diagnostics research.
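As an illustration of the evaluation metrics named in the Results, the following minimal Python sketch shows how accuracy, sensitivity, specificity, PPV, NPV, and F1 score follow from a binary confusion matrix, and how an absolute gain (in percentage points) differs from a relative gain (in percent). It is not the study's code: the names `ConfusionCounts`, `threshold_metrics`, and `improvement` are hypothetical, and the confusion-matrix counts are invented for illustration only. The AUC and accuracy values passed to `improvement` are the ones reported in the abstract. AUC itself is not computed here, since it requires predicted probabilities rather than counts.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts from a binary confusion matrix (positive class = VAP)."""
    tp: int  # true positives
    fp: int  # false positives
    tn: int  # true negatives
    fn: int  # false negatives


def safe_div(num: float, den: float) -> float:
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def threshold_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Compute the threshold-based metrics listed in the abstract."""
    sensitivity = safe_div(c.tp, c.tp + c.fn)  # recall for the positive class
    specificity = safe_div(c.tn, c.tn + c.fp)
    ppv = safe_div(c.tp, c.tp + c.fp)          # precision
    npv = safe_div(c.tn, c.tn + c.fn)
    accuracy = safe_div(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn)
    f1 = safe_div(2 * ppv * sensitivity, ppv + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "f1": f1,
    }


def improvement(new: float, baseline: float) -> tuple[float, float]:
    """Return (absolute gain in percentage points, relative gain in percent)."""
    absolute_pp = (new - baseline) * 100
    relative_pct = (new - baseline) / baseline * 100
    return absolute_pp, relative_pct


# Hypothetical counts, chosen only for illustration (not from the study).
example = ConfusionCounts(tp=40, fp=10, tn=45, fn=5)
metrics = threshold_metrics(example)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")

# Values reported in the abstract: XGBoost vs. best prior result.
auc_gain = improvement(0.940, 0.706)
acc_gain = improvement(0.875, 0.640)
print(f"AUC: +{auc_gain[0]:.1f} points, +{auc_gain[1]:.1f}% relative")
print(f"Accuracy: +{acc_gain[0]:.1f} points, +{acc_gain[1]:.1f}% relative")
```

With the reported values, the absolute gains are 23.4 and 23.5 percentage points, while the corresponding relative gains are roughly 33% and 37%, which is why the two ways of stating an improvement should not be used interchangeably.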