Deep Neural Networks (DNNs) have revolutionized artificial intelligence, achieving impressive results on diverse data types, including images, videos, and texts. However, DNNs still lag behind Gradient Boosting Decision Trees (GBDT) on tabular data, a format extensively utilized across various domains. In this paper, we propose DOFEN, short for \textbf{D}eep \textbf{O}blivious \textbf{F}orest \textbf{EN}semble, a novel DNN architecture inspired by oblivious decision trees. DOFEN constructs relaxed oblivious decision trees (rODTs) by randomly combining conditions for each column and further enhances performance with a two-level rODT forest ensembling process. By employing this approach, DOFEN achieves state-of-the-art results among DNNs and further narrows the gap between DNNs and tree-based models on the well-recognized benchmark: Tabular Benchmark \citep{grinsztajn2022tree}, which includes 73 total datasets spanning a wide array of domains. The code of DOFEN is available at: \url{https://github.com/Sinopac-Digital-Technology-Division/DOFEN}.
翻译:深度神经网络(DNNs)在人工智能领域引发了革命性变革,在图像、视频和文本等多种数据类型上取得了令人瞩目的成果。然而,在表格数据这种广泛应用于各领域的格式上,DNNs的表现仍落后于梯度提升决策树(GBDT)。本文提出DOFEN(\textbf{D}eep \textbf{O}blivious \textbf{F}orest \textbf{EN}semble的缩写),这是一种受无意识决策树启发的新型DNN架构。DOFEN通过随机组合每列的条件来构建松弛无意识决策树(rODTs),并进一步通过两级rODT森林集成过程提升性能。采用该方法后,DOFEN在DNN中取得了最先进的成果,并在公认的基准测试——包含跨广泛领域总计73个数据集的Tabular Benchmark \citep{grinsztajn2022tree}上,进一步缩小了DNN与基于树的模型之间的差距。DOFEN的代码发布于:\url{https://github.com/Sinopac-Digital-Technology-Division/DOFEN}。