When developing empirical equations, domain experts require them to be accurate and to adhere to physical laws. Often, constants with unknown units must be discovered alongside the equations. Traditional unit-aware genetic programming (GP) approaches cannot be used when the equations include unknown constants with undetermined units. This paper presents a method for dimensional analysis that propagates unknown units as "jokers" and returns the magnitude of unit violations. We propose three methods for integrating this dimensional analysis into the GP algorithm: evolutive culling, a repair mechanism, and a multi-objective approach. Experiments on datasets with ground truth show that evolutive culling and the multi-objective approach perform comparably to a baseline without dimensional analysis. Extensive analysis of the results on datasets without ground truth reveals that the unit-aware algorithms sacrifice little accuracy while producing unit-adherent solutions. Overall, we present a promising novel approach for developing unit-adherent empirical equations.
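To make the idea of joker propagation concrete, the following is a minimal sketch and not the paper's implementation. It assumes units are exponent vectors over (metre, kilogram, second), that numeric constants carry a joker unit, that a joker absorbs the other operand under multiplication and division, and that the violation of an addition is the summed absolute difference of the exponents. The names `JOKER`, `combine`, and `analyse` are hypothetical.

```python
from typing import Optional, Tuple

# Assumption: a unit is a tuple of exponents over (metre, kilogram, second).
# None stands for the "joker", i.e. a unit that is still undetermined.
Unit = Optional[Tuple[int, ...]]
JOKER: Unit = None


def combine(a: Unit, b: Unit, sign: int) -> Unit:
    """Unit of a*b (sign=+1) or a/b (sign=-1); a joker operand yields a joker."""
    if a is JOKER or b is JOKER:
        return JOKER
    return tuple(x + sign * y for x, y in zip(a, b))


def analyse(expr, units) -> Tuple[Unit, int]:
    """Return (unit, violation) for a nested-tuple expression tree.

    Leaves are variable names (looked up in `units`) or numeric constants
    (treated as jokers). Inner nodes are (operator, left, right).
    """
    if isinstance(expr, str):
        return units[expr], 0
    if isinstance(expr, (int, float)):
        return JOKER, 0

    op, left, right = expr
    left_unit, left_violation = analyse(left, units)
    right_unit, right_violation = analyse(right, units)
    violation = left_violation + right_violation

    if op in ("*", "/"):
        return combine(left_unit, right_unit, 1 if op == "*" else -1), violation

    if op in ("+", "-"):
        # A joker adapts to the known operand, so no violation is counted.
        if left_unit is JOKER:
            return right_unit, violation
        if right_unit is JOKER:
            return left_unit, violation
        # Both units are known: measure how far apart they are.
        violation += sum(abs(x - y) for x, y in zip(left_unit, right_unit))
        return left_unit, violation

    raise ValueError(f"unsupported operator: {op!r}")


# Hypothetical variables: distance d [m], time t [s], mass m [kg].
units = {"d": (1, 0, 0), "t": (0, 0, 1), "m": (0, 1, 0)}

print(analyse(("/", "d", "t"), units))              # ((1, 0, -1), 0)
print(analyse(("+", "d", "t"), units))              # ((1, 0, 0), 2)
print(analyse(("+", ("*", 2.5, "t"), "d"), units))  # ((1, 0, 0), 0)
```

In the last call the constant `2.5` has an undetermined unit, so the product `2.5 * t` is a joker and can take the unit of `d` without any violation. A violation count of this kind could serve as a culling criterion, a repair trigger, or a second objective, although how the paper defines and uses it is not specified in the abstract.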