Due to their efficiency and small size, decision trees and random forests are popular machine learning models for classification on resource-constrained systems. In such systems, the execution time available for random forest inference might not suffice to execute the complete model. Ideally, the prediction confidence gained up to that point should be retained. An anytime algorithm can be aborted at any time and returns a result whose quality increases over time. Previous approaches have realized random forests as anytime algorithms at the granularity of trees, stopping after some but not all trees of a forest have been executed. However, because a decision tree subdivides the sample space with every step, each additional step within a single tree also increases prediction quality. In this paper, we realize decision trees and random forests as anytime algorithms at the granularity of single steps in trees. This approach opens a design space for defining the step order in a forest, which has the potential to optimize the mean accuracy. We propose the Optimal Order, which finds a step order with maximal mean accuracy in exponential runtime, and the polynomial-runtime heuristics Forward Squirrel Order and Backward Squirrel Order, which greedily maximize the accuracy for each additional step taken down and up the trees, respectively. Our evaluation shows that the Backward Squirrel Order performs $\sim94\%$ as well as the Optimal Order and $\sim99\%$ as well as all other step orders.
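To make step-granular anytime inference concrete, the following is a minimal sketch and not the implementation from the paper. It assumes that every node, including inner nodes, stores the majority class of the training samples that reach it, that a step order is a sequence of tree indices, and that an aborted forest answers by majority vote over the nodes reached so far. The names `Node`, `step`, and `anytime_forest_predict` are hypothetical, and the step order is taken as given rather than computed by the Optimal Order or either Squirrel Order.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class Node:
    # Assumption: every node (not only leaves) carries a class label, so a
    # prediction is available after any number of steps.
    label: int
    feature: Optional[int] = None   # index of the split feature, None for a leaf
    threshold: float = 0.0
    left: Optional["Node"] = None   # taken when x[feature] <= threshold
    right: Optional["Node"] = None  # taken otherwise


def step(node: Node, x: Sequence[float]) -> Node:
    """Take one step down a tree; a leaf stays where it is."""
    if node.feature is None:
        return node
    return node.left if x[node.feature] <= node.threshold else node.right


def anytime_forest_predict(
    roots: List[Node],
    x: Sequence[float],
    step_order: Sequence[int],
    budget: int,
) -> int:
    """Run at most `budget` steps of `step_order`, then vote.

    `step_order[i]` names the tree that takes the i-th step. Aborting early
    (a small `budget`) still yields a prediction from the nodes reached so far.
    """
    current = list(roots)
    for tree_index in step_order[:budget]:
        current[tree_index] = step(current[tree_index], x)
    votes = Counter(node.label for node in current)
    return votes.most_common(1)[0][0]
```

With `budget=0` the forest votes with the root labels only, and each further step refines exactly one tree. Which tree takes the next step is the design space that a step order fills in.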