Missing values are ubiquitous in multivariate time series. They leave the data only partially observed, compromise the integrity of the series, and hinder effective analysis. Recently, deep learning imputation methods have demonstrated remarkable success in improving the quality of corrupted time series data, which in turn enhances performance in downstream tasks. In this paper, we conduct a comprehensive survey of recently proposed deep learning imputation methods. First, we propose a taxonomy for the reviewed methods, and then provide a structured review that highlights their strengths and limitations. We also conduct empirical experiments to study the different methods and compare the improvement each brings to downstream tasks. Finally, we point out open issues for future research on multivariate time series imputation. All code and configurations of this work, including a regularly maintained list of multivariate time series imputation papers, can be found in the GitHub repository~\url{https://github.com/WenjieDu/Awesome\_Imputation}.