We present an analysis of the loss of population-level test coverage induced by different down-sampling strategies when combined with lexicase selection. We study recorded populations from the first generation of genetic programming runs, as well as entirely synthetic populations. Our findings verify the hypothesis that informed down-sampling maintains population-level test coverage better than random down-sampling. Additionally, we show that both forms of down-sampling cause greater test coverage loss than standard lexicase selection with no down-sampling. However, we find that informed down-sampling can further reduce its test coverage loss when given more information about the population. We also recommend wider adoption of the static population analyses we present in this work.