A phylogeny describes the evolutionary history of an evolving population. Evolutionary search algorithms can perfectly track the ancestry of candidate solutions, illuminating a population's trajectory through the search space. However, phylogenetic analyses are typically limited to post-hoc studies of search performance. We introduce phylogeny-informed subsampling, a new class of subsampling methods that exploit runtime phylogenetic analyses for solving test-based problems. Specifically, we assess two phylogeny-informed subsampling methods, individualized random subsampling and ancestor-based subsampling, on three diagnostic problems and ten genetic programming (GP) problems from program synthesis benchmark suites. Overall, we found that phylogeny-informed subsampling methods enable problem-solving success at extreme subsampling levels where other subsampling methods fail. For example, phylogeny-informed subsampling methods more reliably solved program synthesis problems when evaluating just one training case per individual per generation. However, at moderate subsampling levels, phylogeny-informed subsampling generally performed no better than random subsampling on GP problems. Our diagnostic experiments show that phylogeny-informed subsampling improves diversity maintenance relative to random subsampling, but its effect on the capacity to rapidly exploit fitness gradients varied by selection scheme. Continued refinements of phylogeny-informed subsampling techniques offer a promising new direction for scaling up evolutionary systems to handle problems with many expensive-to-evaluate fitness criteria.
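To make the idea concrete, the following is a minimal Python sketch of how runtime ancestry tracking could support subsampling. It is an illustration under stated assumptions, not the method evaluated in the paper: the abstract does not specify how phylogenetic information is used, so the ancestor lookup in `estimate_score`, the `Individual` class, and the `max_depth` parameter are all hypothetical. Each individual is evaluated on its own random subset of training cases, and a score for an unevaluated case is borrowed from the nearest ancestor that was evaluated on that case.

```python
import random
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Optional, Sequence


@dataclass
class Individual:
    """A candidate solution that remembers its parent (hypothetical structure)."""
    genome: Any
    parent: Optional["Individual"] = None
    # Maps training-case index -> score measured directly on this individual.
    scores: Dict[int, float] = field(default_factory=dict)


def evaluate_subsample(ind: Individual,
                       cases: Sequence[Any],
                       score_fn: Callable[[Any, Any], float],
                       k: int,
                       rng: random.Random) -> None:
    """Evaluate `ind` on its own random subset of k training cases."""
    for i in rng.sample(range(len(cases)), k):
        ind.scores[i] = score_fn(ind.genome, cases[i])


def estimate_score(ind: Individual,
                   case_idx: int,
                   max_depth: Optional[int] = None) -> Optional[float]:
    """Return the score on `case_idx` from `ind` or its nearest evaluated ancestor.

    Assumption: an ancestor's score is an acceptable stand-in for an
    unevaluated case. Returns None if no individual within `max_depth`
    steps up the lineage was evaluated on that case.
    """
    node, depth = ind, 0
    while node is not None and (max_depth is None or depth <= max_depth):
        if case_idx in node.scores:
            return node.scores[case_idx]
        node, depth = node.parent, depth + 1
    return None
```

In this sketch, a selection scheme would call `estimate_score` for every training case, so an individual evaluated on a single case per generation can still be compared on all cases. Estimates taken from distant ancestors may be stale, which is why the hypothetical `max_depth` bound is included.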