Lexicase selection is a widely used parent selection algorithm in genetic programming, known for its success in task domains such as program synthesis, symbolic regression, and machine learning. Due to its non-parametric and recursive nature, calculating the probability of each individual being selected by lexicase selection has been proven to be an NP-hard problem, which hinders both deeper theoretical understanding of the algorithm and practical improvements to it. In this work, we introduce probabilistic lexicase selection (plexicase selection), a novel parent selection algorithm that efficiently approximates the probability distribution of lexicase selection. Our method not only demonstrates superior problem-solving capabilities as a semantic-aware selection method, but also benefits from a probabilistic representation of the selection process, which provides enhanced efficiency and flexibility. Experiments are conducted in two prevalent domains of genetic programming, program synthesis and symbolic regression, using standard benchmarks including PSB and SRBench. The empirical results show that plexicase selection achieves state-of-the-art problem-solving performance that is competitive with lexicase selection, and significantly outperforms lexicase selection in computational efficiency.
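For readers unfamiliar with the baseline, the following is a minimal sketch of standard lexicase selection as it is commonly described in the literature, together with a naive Monte Carlo estimate of each individual's selection probability. It is not the plexicase algorithm introduced in this work, whose details are not given in the abstract. The function names `lexicase_select` and `estimate_selection_probabilities`, the `errors` matrix layout, and the toy data are all assumptions made for illustration.

```python
import random
from collections import Counter


def lexicase_select(errors, rng):
    """Pick one parent index using standard lexicase selection.

    Assumed layout: errors[i][j] is the error of individual i on
    training case j, where lower is better.
    """
    candidates = list(range(len(errors)))
    cases = list(range(len(errors[0])))
    rng.shuffle(cases)  # each selection event uses a fresh random case order
    for case in cases:
        # Keep only the candidates that are best on the current case.
        best = min(errors[i][case] for i in candidates)
        candidates = [i for i in candidates if errors[i][case] == best]
        if len(candidates) == 1:
            break
    # Break any remaining ties uniformly at random.
    return rng.choice(candidates)


def estimate_selection_probabilities(errors, n_samples, seed=0):
    """Estimate selection probabilities by repeated sampling.

    This is plain Monte Carlo sampling used only for illustration;
    it is not the approximation proposed in the paper.
    """
    rng = random.Random(seed)
    counts = Counter(lexicase_select(errors, rng) for _ in range(n_samples))
    return [counts[i] / n_samples for i in range(len(errors))]


# Hypothetical toy population: three individuals, two training cases.
toy_errors = [
    [0, 1],  # best on case 0
    [1, 0],  # best on case 1
    [1, 1],  # never best on any case
]
probs = estimate_selection_probabilities(toy_errors, n_samples=10_000)
```

In this toy population the third individual is never best on any case, so it is filtered out under every case ordering, while the first two individuals split the selection probability roughly evenly. Estimating the distribution this way requires many selection events, which gives a sense of why an efficient approximation of the distribution is attractive.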