In this article, we strengthen the consistency proofs of several random forest variants that were previously shown to be only weakly consistent, establishing strong consistency for them, and we improve the data utilization of these variants to obtain better theoretical properties and experimental performance. In addition, building on the multinomial random forest (MRF) and the Bernoulli random forest (BRF), we propose a data-driven multinomial random forest (DMRF) algorithm, which is strongly consistent and whose complexity is lower than that of MRF but higher than that of BRF. DMRF performs better on classification and regression problems than previous RF variants that satisfy only weak consistency, and in most cases it even surpasses the standard random forest. To the best of our knowledge, DMRF is currently the best-performing strongly consistent RF variant with low algorithmic complexity.