While interest in tabular deep learning has grown significantly, conventional tree-based models still outperform deep learning methods. To narrow this performance gap, we explore a retrieval mechanism, a methodology that allows neural networks to refer to other data points while making predictions. Our experiments reveal that retrieval-based training, especially when fine-tuning the pretrained TabPFN model, notably surpasses existing methods. Moreover, extensive pretraining plays a crucial role in enhancing the performance of the model. These insights imply that blending the retrieval mechanism with pretraining and transfer learning schemes offers considerable potential for advancing the field of tabular deep learning.
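To make the idea of "referring to other data points while making predictions" concrete, the following is a minimal sketch of retrieval at prediction time. It is not the method evaluated in this work and does not use TabPFN: it assumes plain Euclidean nearest-neighbour retrieval and a softmax-weighted vote over the retrieved labels, and the function name `retrieve_and_predict` and the parameters `k` and `temperature` are hypothetical, chosen only for illustration.

```python
import numpy as np


def retrieve_and_predict(query, train_x, train_y, k=3, temperature=1.0):
    """Return class probabilities for one query row from its k nearest training rows.

    Illustrative assumption: Euclidean retrieval plus a softmax-weighted label vote.
    """
    # Distance from the query to every training row.
    dists = np.linalg.norm(train_x - query, axis=1)
    # Indices of the k closest rows: the retrieved context.
    idx = np.argsort(dists)[:k]
    # Softmax over negative distances, so closer rows receive larger weights.
    logits = -dists[idx] / temperature
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()
    # Accumulate the weights per class label of the retrieved rows.
    n_classes = int(train_y.max()) + 1
    probs = np.zeros(n_classes)
    np.add.at(probs, train_y[idx], weights)
    return probs


# Toy data: two rows per class, with the query placed near the class-0 rows.
train_x = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]])
train_y = np.array([0, 0, 1, 1])
probs = retrieve_and_predict(np.array([0.05, 0.1]), train_x, train_y, k=3)
print(probs)
```

In a learned retrieval model, the fixed distance and vote would typically be replaced by trainable components; the sketch only shows how retrieved rows can inform a single prediction.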