Time series in Electronic Health Records (EHRs) present unique challenges for generative models, such as irregular sampling, missing values, and high dimensionality. In this paper, we propose a novel generative adversarial network (GAN) model, TimEHR, to generate time series data from EHRs. In particular, TimEHR treats time series as images and is based on two conditional GANs. The first GAN generates missingness patterns, and the second GAN generates time series values based on the missingness pattern. Experimental results on three real-world EHR datasets show that TimEHR outperforms state-of-the-art methods in terms of fidelity, utility, and privacy metrics.
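To make the image view and the two-stage design concrete, the following is a minimal sketch and not the authors' implementation. It assumes that irregular observations are rasterized onto a fixed grid of features by time bins, giving a value image and a binary missingness mask, and that sampling first draws a mask and then draws values conditioned on it. The function names `series_to_image` and `sample_two_stage`, the binning scheme, the 0.5 threshold, the noise dimension, and the form of the conditioning vector are all hypothetical choices made for illustration; the two generators are passed in as plain callables standing in for trained networks.

```python
import numpy as np


def series_to_image(times, feature_ids, values, n_features, n_bins, t_max):
    """Rasterize irregular (time, feature, value) observations onto a grid.

    Returns (image, mask), both of shape (n_features, n_bins). mask is 1
    where a cell holds at least one observation and 0 where it is missing.
    The binning here is an assumption, not the paper's preprocessing.
    """
    times = np.asarray(times, dtype=float)
    feature_ids = np.asarray(feature_ids, dtype=int)
    values = np.asarray(values, dtype=float)

    # Map each timestamp to a column; clip so t == t_max lands in the last bin.
    cols = np.clip((times / t_max * n_bins).astype(int), 0, n_bins - 1)

    # Accumulate sums and counts per cell so repeated hits can be averaged.
    sums = np.zeros((n_features, n_bins))
    counts = np.zeros((n_features, n_bins))
    np.add.at(sums, (feature_ids, cols), values)
    np.add.at(counts, (feature_ids, cols), 1)

    mask = (counts > 0).astype(np.float32)
    image = np.divide(sums, counts, out=np.zeros_like(sums), where=counts > 0)
    return image.astype(np.float32), mask


def sample_two_stage(mask_generator, value_generator, condition, rng, noise_dim=16):
    """Draw one synthetic sample: missingness mask first, then values.

    mask_generator(z, condition) and value_generator(z, condition, mask) are
    hypothetical stand-ins for the two trained conditional generators.
    """
    z_mask = rng.standard_normal(noise_dim)
    mask = (mask_generator(z_mask, condition) > 0.5).astype(np.float32)

    z_val = rng.standard_normal(noise_dim)
    values = value_generator(z_val, condition, mask)

    # Blank out cells the mask marks as missing.
    return values * mask, mask


# Usage: two features observed irregularly over a 48-hour window, 4 time bins.
image, mask = series_to_image(
    times=[0.5, 1.2, 1.4, 47.9],
    feature_ids=[0, 1, 1, 0],
    values=[80.0, 120.0, 130.0, 75.0],
    n_features=2,
    n_bins=4,
    t_max=48.0,
)
```

Separating the mask from the values lets the second generator see exactly which cells it is expected to fill, which mirrors the split between missingness and values described in the abstract.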