In this contribution, we discuss the basic concepts of genotypes and phenotypes in tree-based genetic programming (tree-based GP, or TGP) and analyze their behavior on five benchmark datasets. We show that TGP exhibits the same behavior observed in other GP representations: at the genotypic level, trees frequently show unchecked growth with seemingly ineffective code, whereas at the phenotypic level, much smaller trees can be observed. To generate phenotypes, we provide a unique technique for removing semantically ineffective code from GP trees. The approach extracts considerably simpler phenotypes while not being limited to local operations in the genotype. We generalize this transformation with a problem-independent parameter that enables further simplification of the exact phenotype by coarse-graining, producing approximate phenotypes. The concept of these phenotypes (exact and approximate) allows us to clarify what evolved solutions truly predict, making GP models considered at the phenotypic level much more interpretable.
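To make the distinction between genotype, exact phenotype, and approximate phenotype concrete, the following minimal sketch illustrates one possible form of semantic simplification. It is not the technique proposed in this work. It assumes a hypothetical tuple encoding of trees, a single input variable, and a tolerance parameter `eps` that stands in for the problem-independent coarse-graining parameter. Each subtree is replaced by the smallest candidate whose output vector on the sample inputs stays within `eps` of the original; the candidates include any descendant, so the replacement is not restricted to local rewrites.

```python
import operator

# Hypothetical tree encoding (an assumption, not taken from the paper):
#   ("x",)  ("const", c)  (op, left, right)
OPS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}


def evaluate(tree, xs):
    """Return the semantics of a tree: its outputs on the sample inputs xs."""
    if tree[0] == "x":
        return list(xs)
    if tree[0] == "const":
        return [float(tree[1])] * len(xs)
    left, right = evaluate(tree[1], xs), evaluate(tree[2], xs)
    return [OPS[tree[0]](a, b) for a, b in zip(left, right)]


def size(tree):
    """Count the nodes of a tree."""
    if tree[0] in OPS:
        return 1 + size(tree[1]) + size(tree[2])
    return 1


def subtrees(tree):
    """Yield the tree itself and every subtree below it."""
    yield tree
    if tree[0] in OPS:
        yield from subtrees(tree[1])
        yield from subtrees(tree[2])


def distance(u, v):
    """Largest absolute difference between two output vectors."""
    return max(abs(a - b) for a, b in zip(u, v))


def simplify(tree, xs, eps=0.0):
    """Return a smaller tree whose outputs stay within eps of the original.

    eps = 0 keeps the semantics exact on xs (an "exact phenotype" in this
    sketch); eps > 0 coarse-grains it (an "approximate phenotype").
    """
    if tree[0] not in OPS:
        return tree  # leaves cannot be simplified further
    target = evaluate(tree, xs)
    rebuilt = (tree[0], simplify(tree[1], xs, eps), simplify(tree[2], xs, eps))
    candidates = list(subtrees(rebuilt))
    candidates.append(("const", sum(target) / len(target)))
    candidates.append(tree)  # fallback: the original is always within eps
    valid = [c for c in candidates if distance(evaluate(c, xs), target) <= eps]
    return min(valid, key=size)


xs = [-1.0, -0.5, 0.0, 0.5, 1.0]
# Genotype with ineffective code: x + (x - x) * 3
bloated = ("add", ("x",), ("mul", ("sub", ("x",), ("x",)), ("const", 3.0)))
print(size(bloated), simplify(bloated, xs))
# Nearly ineffective code: x + 0.01
noisy = ("add", ("x",), ("const", 0.01))
print(simplify(noisy, xs, eps=0.0), simplify(noisy, xs, eps=0.05))
```

In this sketch, the seven-node genotype `x + (x - x) * 3` reduces to the single node `x` with `eps = 0`, while `x + 0.01` is left unchanged at `eps = 0` and collapses to `x` only once the tolerance exceeds the constant offset. Equivalence is judged on the sample inputs alone, so a tree simplified this way is only guaranteed to match the original on those inputs.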