Until recently, the question of the effective inductive bias of deep models on tabular data remained unanswered. This paper investigates the hypothesis that arithmetic feature interaction is necessary for deep tabular learning. To test this hypothesis, we create a synthetic tabular dataset with a mild feature interaction assumption and examine a modified transformer architecture that enables arithmetic feature interactions, referred to as AMFormer. Results show that AMFormer outperforms strong counterparts in fine-grained tabular data modeling, training data efficiency, and generalization. We attribute this to its parallel additive and multiplicative attention operators and prompt-based optimization, which facilitate the separation of tabular samples in an extended space with arithmetically engineered features. Our extensive experiments on real-world data also validate the consistent effectiveness, efficiency, and rationale of AMFormer, suggesting that it establishes a strong inductive bias for deep learning on tabular data. Code is available at https://github.com/aigc-apps/AMFormer.
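To make the notion of additive versus multiplicative feature interaction concrete, the following is a minimal sketch and not the AMFormer implementation. It assumes strictly positive features and uses the standard identity that a weighted product equals the exponential of a weighted sum of logarithms; the function names `additive_interaction` and `multiplicative_interaction` are hypothetical and chosen only for illustration.

```python
import math


def additive_interaction(weights, features):
    """Weighted sum of features (illustrative additive mixing)."""
    return sum(w * x for w, x in zip(weights, features))


def multiplicative_interaction(weights, features):
    """Weighted product of features, computed as exp(sum(w * log(x))).

    Assumption for this sketch: every feature is strictly positive,
    so the logarithm is defined.
    """
    if any(x <= 0 for x in features):
        raise ValueError("features must be strictly positive in this sketch")
    return math.exp(sum(w * math.log(x) for w, x in zip(weights, features)))


features = [2.0, 3.0, 4.0]

# Weights select which features take part in the interaction.
add_out = additive_interaction([1.0, 1.0, 0.0], features)        # x0 + x1
mul_out = multiplicative_interaction([1.0, 1.0, 0.0], features)  # x0 * x1
ratio = multiplicative_interaction([1.0, -1.0, 0.0], features)   # x0 / x1
```

In this sketch the same weight vector yields a sum in one branch and a product in the other, and a negative weight turns the product into a ratio. Whether AMFormer computes its multiplicative operator this way is not stated in the abstract.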