Missing value imputation is a fundamental challenge in machine intelligence, which depends heavily on data completeness. Current imputation methods often handle numerical and categorical attributes independently, overlooking critical interdependencies among heterogeneous features. To address this limitation, we propose a novel imputation approach that explicitly models cross-type feature dependencies within a unified framework. Our method leverages both complete and incomplete instances to ensure accurate and consistent imputation in tabular data. Extensive experimental results demonstrate that the proposed approach outperforms existing techniques and significantly improves performance on downstream machine learning tasks, providing a robust solution for real-world systems with missing data.