Deep learning models for chest X-ray diagnosis are constrained by limited coverage of clinically meaningful concept combinations in publicly available training datasets. While synthetic image generation has been explored to increase data diversity, existing methods rarely enforce clinical or anatomical constraints, limiting utility for improving model reliability. We propose CARPA, a clinically aware and anatomically grounded framework for synthetic chest X-ray generation that applies targeted perturbations to clinical concept vectors while preserving anatomical structure. By producing anatomically faithful synthetic images with controlled concept insertions and deletions, CARPA expands clinically relevant concept coverage. We evaluate CARPA across seven backbone architectures by fine-tuning models on synthetic subsets and testing on a held-out MIMIC-CXR benchmark. Compared to prior concept perturbation approaches, fine-tuning on CARPA-generated images consistently improves precision-recall performance, reduces predictive uncertainty, and improves model calibration. Structural and semantic analyses demonstrate high anatomical fidelity, strong concept alignment, and low semantic uncertainty. Evaluation by two expert radiologists further confirms realism and clinical agreement. Together, these results show that anatomically grounded concept perturbations enable more effective use of synthetic data, improving both performance and reliability of chest X-ray classification models and supporting safer clinical deployment.