In this paper, we propose a feature weighting method that addresses a limitation of existing feature processing methods for tabular data. Existing methods typically assume equal importance across all samples and features in a dataset. This simplification overlooks the unique contribution of each feature and may therefore miss important feature information, leading to suboptimal performance on complex datasets with rich features. To address this problem, we introduce Tabular Feature Weighting with Transformer (TFWT), a novel feature weighting approach for tabular data. Our method adopts a Transformer to capture complex feature dependencies and contextually assign appropriate weights to discrete and continuous features. In addition, we employ a reinforcement learning strategy to further fine-tune the weighting process. Extensive experiments across various real-world datasets and diverse downstream tasks show the effectiveness of TFWT and highlight its potential for enhancing feature weighting in tabular data analysis.