In this study, we introduce a generative conditional strategy for imputing missing values in datasets, a problem of considerable importance in statistical analysis. We first present the theoretical foundations of Generative Conditional Missing Imputation Networks (GCMI) and show that the method has robust properties under both the Missing Completely at Random (MCAR) and the Missing at Random (MAR) mechanisms. We then improve the robustness and accuracy of GCMI by integrating it into a multiple imputation framework based on chained equations, which strengthens model stability and improves imputation performance. Finally, through simulations and empirical evaluations on benchmark datasets, we show that the proposed methods outperform other leading imputation techniques. These results support the practical value of GCMI as a tool for statistical data analysis.