The rapid growth of data across diverse fields has made accurate imputation of missing values increasingly important, both for preserving data integrity and for deriving meaningful insights. To address this challenge, we present Xputer, a novel imputation tool that integrates Non-negative Matrix Factorization (NMF) with the predictive strengths of XGBoost. Xputer is versatile: it supports zero imputation, enables hyperparameter optimization through Optuna, and allows users to define the number of iterations. For accessibility, Xputer includes an intuitive Graphical User Interface (GUI), so that users less familiar with computational tools can operate it easily. In performance benchmarks, Xputer is comparable in computational speed to established tools such as IterativeImputer and often outperforms them in imputation accuracy. Furthermore, Xputer automatically handles categorical, continuous, and Boolean data, eliminating the need for prior preprocessing. Given its combination of performance, flexibility, and user-friendly design, Xputer offers a state-of-the-art solution for data imputation.
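The abstract does not spell out how NMF and XGBoost are combined, so the following is only a minimal sketch of one plausible reading: an NMF fitted on the observed cells provides an initial fill, and a per-column regressor then refines the missing cells over a user-chosen number of iterations. The names `masked_nmf_fill`, `default_model`, `impute`, and `make_model` are hypothetical and are not Xputer's API. The sketch assumes non-negative, purely numeric data and omits the categorical handling, zero imputation option, and Optuna tuning described above.

```python
import numpy as np


def masked_nmf_fill(X, rank=2, n_steps=200, seed=0, eps=1e-9):
    """Fill NaNs in a non-negative matrix with an NMF fitted on observed cells only."""
    rng = np.random.default_rng(seed)
    mask = ~np.isnan(X)                  # True where a value is observed
    X0 = np.where(mask, X, 0.0)          # missing cells contribute nothing
    W = rng.random((X.shape[0], rank)) + eps
    H = rng.random((rank, X.shape[1])) + eps
    for _ in range(n_steps):
        # Multiplicative updates restricted to the observed entries.
        WH = mask * (W @ H)
        W *= (X0 @ H.T) / (WH @ H.T + eps)
        WH = mask * (W @ H)
        H *= (W.T @ X0) / (W.T @ WH + eps)
    # Keep observed values; take the low-rank reconstruction elsewhere.
    return np.where(mask, X, W @ H)


def default_model():
    # Imported lazily so the NMF step works without xgboost installed.
    # The hyperparameter values are illustrative assumptions.
    from xgboost import XGBRegressor
    return XGBRegressor(n_estimators=50, max_depth=3, learning_rate=0.1)


def impute(X, n_iter=3, make_model=default_model, rank=2):
    """NMF initial fill, then per-column regression refinement (a sketch)."""
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    filled = masked_nmf_fill(X, rank=rank)
    for _ in range(n_iter):
        for j in range(X.shape[1]):
            miss = missing[:, j]
            if not miss.any() or miss.all():
                continue                 # nothing to predict, or nothing to train on
            others = np.delete(filled, j, axis=1)
            model = make_model()
            # Train on rows where column j is observed, predict where it is missing.
            model.fit(others[~miss], X[~miss, j])
            filled[miss, j] = model.predict(others[miss])
    return filled
```

Passing the regressor through a `make_model` factory keeps the loop independent of any particular library, so a different learner can be swapped in for comparison; calling `impute(X, n_iter=5)` with the default factory requires `xgboost` to be installed.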