Many modern applications rely on heterogeneous tabular data, on which regression and classification remain challenging. Many approaches have been proposed to adapt neural networks to this setting, yet boosting and bagging of decision trees are still the best-performing methods. In this paper, we show that a binomially initialized neural network can be used effectively on tabular data. The proposed approach is a simple but effective scheme for initializing the first hidden layer of a neural network. We also show that this initialization scheme can be used to jointly train ensembles by adding gradient masking to batch entries and using the binomial initialization for the last layer of the network. For this purpose, we modified the binary hinge loss and the softmax loss to make them applicable to joint ensemble training. We evaluate our approach on multiple public datasets and show improved performance compared to other neural network-based approaches. In addition, we discuss the limitations of our approach and possible directions for further research on improving the applicability of neural networks to tabular data. Link: https://es-cloud.cs.uni-tuebingen.de/d/8e2ab8c3fdd444e1a135/?p=%2FInitializationNeuronalNetworksTabularData&mode=list