This paper presents enhancements to the projection pursuit tree classifier and visual diagnostic methods for assessing their impact in high dimensions. The original algorithm uses linear combinations of variables in a tree structure where depth is constrained to be less than the number of classes -- a limitation that proves too rigid for complex classification problems. Our extensions improve performance in multi-class settings with unequal variance-covariance structures and nonlinear class separations by allowing more splits and more flexible class groupings in the projection pursuit computation. Proposing algorithmic improvements is straightforward; demonstrating their actual utility is not. We therefore develop two visual diagnostic approaches to verify that the enhancements perform as intended. Using high-dimensional visualization techniques, we examine model fits on benchmark datasets to assess whether the algorithm behaves as theorized. An interactive web application enables users to explore the behavior of both the original and enhanced classifiers under controlled scenarios. The enhancements are implemented in the R package PPtreeExt.