DALL-M: Context-Aware Clinical Data Augmentation with LLMs

X-ray images are vital in medical diagnostics, but their effectiveness is limited without clinical context. Radiologists often find chest X-rays insufficient for diagnosing underlying diseases, necessitating the integration of structured clinical features with radiology reports. To address this, we introduce DALL-M, a novel framework that enhances clinical datasets by generating contextual synthetic data. DALL-M augments structured patient data, including vital signs (e.g., heart rate, oxygen saturation), radiology findings (e.g., lesion presence), and demographic factors. It integrates this tabular data with contextual knowledge extracted from radiology reports and domain-specific resources (e.g., Radiopaedia, Wikipedia), ensuring clinical consistency and reliability. DALL-M follows a three-phase process: (i) clinical context storage, (ii) expert query generation, and (iii) context-aware feature augmentation. Using large language models (LLMs), it generates both contextual synthetic values for existing clinical features and entirely new, clinically relevant features. Applied to 799 cases from the MIMIC-IV dataset, DALL-M expanded the original 9 clinical features to 91. Empirical validation with machine learning models (including Decision Trees, Random Forests, XGBoost, and TabNet) demonstrated a 16.5% improvement in F1 score and a 25% increase in precision and recall. DALL-M bridges an important gap in clinical data augmentation by preserving data integrity while enhancing predictive modeling in healthcare. Our results show that integrating LLM-generated synthetic features significantly improves model performance, making DALL-M a scalable and practical approach for AI-driven medical diagnostics.
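
The three-phase process named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: every name here (`ContextStore`, `generate_expert_queries`, `augment_patient`, `fake_llm`) is hypothetical, the keyword-overlap retrieval is a toy stand-in for whatever retrieval DALL-M actually uses, and the LLM is replaced by a stub callable so the example runs without any external service.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ContextStore:
    """Phase (i): store clinical context (report text, reference articles)."""
    documents: List[str] = field(default_factory=list)

    def add(self, text: str) -> None:
        self.documents.append(text)

    def retrieve(self, query: str, k: int = 1) -> List[str]:
        # Toy ranking by shared lowercase words (assumption, for illustration only).
        terms = set(query.lower().split())
        ranked = sorted(
            self.documents,
            key=lambda doc: len(terms & set(doc.lower().split())),
            reverse=True,
        )
        return ranked[:k]


def generate_expert_queries(patient: Dict[str, object], new_feature: str) -> List[str]:
    """Phase (ii): one question per positive finding in the patient row."""
    findings = [name.replace("_", " ") for name, value in patient.items() if value is True]
    return [f"How does {finding} relate to {new_feature}?" for finding in findings]


def augment_patient(
    patient: Dict[str, object],
    new_feature: str,
    store: ContextStore,
    llm: Callable[[str], str],
) -> Dict[str, object]:
    """Phase (iii): ask the LLM for a value of new_feature, grounded in retrieved context."""
    context: List[str] = []
    for query in generate_expert_queries(patient, new_feature):
        for doc in store.retrieve(query):
            if doc not in context:
                context.append(doc)
    prompt = (
        "Context:\n" + "\n".join(context)
        + f"\nPatient: {patient}\nEstimate the value of '{new_feature}'."
    )
    augmented = dict(patient)  # leave the original row untouched
    augmented[new_feature] = llm(prompt)
    return augmented


def fake_llm(prompt: str) -> str:
    # Stub standing in for a real LLM call (assumption): answers from the context only.
    return "likely" if "pleural effusion" in prompt.lower() else "unknown"


store = ContextStore()
store.add("Cardiomegaly is an enlarged cardiac silhouette.")
store.add("Pleural effusion can cause dyspnea and reduced oxygen saturation.")
patient = {"heart_rate": 98, "pleural_effusion": True, "age": 67}
result = augment_patient(patient, "dyspnea", store, fake_llm)
print(result)
```

Passing the LLM as a plain callable keeps the sketch testable and makes it explicit that the generated value depends on the retrieved context rather than on the patient row alone.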
