Generating synthetic tabular health data is challenging, and evaluating its quality is equally complex, if not more so. This systematic review highlights the importance of rigorously evaluating synthetic health data to ensure reliability, clinical relevance, and appropriate use. Of the 2067 relevant papers initially identified from the last ten years, 134 studies were selected for detailed analysis. Our review identifies key challenges, including a lack of consensus on evaluation methods, inconsistent application of evaluation metrics, limited involvement of domain experts, inadequate reporting of dataset characteristics, and limited reproducibility of results. In response, we consolidate synthetic data generation and evaluation methods into taxonomies and provide practical guidelines to support more robust and standardised evaluation practices. These findings aim to support the responsible development and use of synthetic health data, in line with emerging expectations around transparency, reproducibility, and governance, ultimately enabling the community to fully harness its transformative potential and accelerate innovation.