Epsilon-lexicase selection is a parent selection method in genetic programming that has been successfully applied to symbolic regression problems. Recently, combining random subsampling with lexicase selection has significantly improved performance in other genetic programming domains such as program synthesis. However, the influence of subsampling on solution quality for real-world symbolic regression problems has not yet been studied. In this paper, we propose down-sampled epsilon-lexicase selection, which combines epsilon-lexicase selection with random subsampling to improve performance in symbolic regression. We compare down-sampled epsilon-lexicase selection with traditional selection methods on common real-world symbolic regression problems and analyze its influence on the properties of the population over a genetic programming run. We find that down-sampled epsilon-lexicase selection reduces diversity compared to standard epsilon-lexicase selection, and that this reduction is accompanied by high hyperselection rates. Further, down-sampled epsilon-lexicase selection outperforms the traditional selection methods on all studied problems. Overall, we observe an improvement in solution quality of up to 85% with down-sampled epsilon-lexicase selection in comparison to standard epsilon-lexicase selection.
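To make the combination of random subsampling and epsilon-lexicase selection concrete, the following is a minimal Python sketch, not the implementation evaluated in the paper. It assumes a precomputed matrix of per-case absolute errors and sets epsilon to the median absolute deviation of the population's errors on each case, which is one common choice; the abstract does not specify the epsilon variant or the sampling rate. The function names and the `sample_rate` parameter are hypothetical.

```python
import numpy as np


def median_absolute_deviation(values):
    """Median absolute deviation, assumed here as the per-case epsilon."""
    return np.median(np.abs(values - np.median(values)))


def epsilon_lexicase_select(errors, rng):
    """Select one parent index from an (n_individuals, n_cases) error matrix."""
    n_individuals, n_cases = errors.shape
    candidates = np.arange(n_individuals)
    # Visit the training cases in a fresh random order for every selection event.
    for case in rng.permutation(n_cases):
        case_errors = errors[candidates, case]
        # Epsilon is computed over the whole population for this case (assumption).
        epsilon = median_absolute_deviation(errors[:, case])
        # Keep candidates within epsilon of the best remaining error on this case.
        candidates = candidates[case_errors <= case_errors.min() + epsilon]
        if candidates.size == 1:
            break
    # Ties that survive all cases are broken uniformly at random.
    return int(rng.choice(candidates))


def down_sampled_epsilon_lexicase(errors, n_parents, sample_rate, rng):
    """Draw one random subset of cases, then select all parents on that subset."""
    n_cases = errors.shape[1]
    n_sampled = max(1, int(round(sample_rate * n_cases)))
    sampled_cases = rng.choice(n_cases, size=n_sampled, replace=False)
    sub_errors = errors[:, sampled_cases]
    return [epsilon_lexicase_select(sub_errors, rng) for _ in range(n_parents)]
```

In this sketch the error matrix is sliced after the fact for simplicity. In a genetic programming run, down-sampling would typically mean that only the sampled cases are evaluated in a given generation, which is where the savings in evaluations come from.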