Since research on estimation-of-distribution algorithms (EDAs) has apparently concentrated entirely on pseudo-Boolean optimization and permutation problems, we take the first steps towards using EDAs for problems in which the decision variables can take more than two values, but which are not permutation problems. To this end, we propose a natural way to extend the known univariate EDAs to such variables. Unlike a naive reduction to the binary case, our extension avoids additional constraints. Since understanding genetic drift is crucial for an optimal parameter choice, we extend the known quantitative analysis of genetic drift to EDAs for multi-valued variables. Roughly speaking, when the variables take $r$ different values, the time for genetic drift to become significant is $r$ times shorter than in the binary case. Consequently, the update strength of the probabilistic model has to be chosen $r$ times lower. To investigate how desired model updates take place in this framework, we conduct a mathematical runtime analysis on the $r$-valued LeadingOnes problem. We prove that with the right parameters, the multi-valued UMDA solves this problem efficiently in $O(r\log(r)^2 n^2 \log(n))$ function evaluations. Overall, our work shows that EDAs can be adapted to multi-valued problems, and it gives advice on how to set the main parameters.
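To make the idea of a univariate EDA over $r$-valued variables concrete, the following is a minimal sketch of one UMDA-style iteration, not the algorithm as specified in the paper. The abstract does not give the definitions, so several details here are assumptions: each position keeps a frequency vector of length $r$, the fitness `leading_count` counts leading positions equal to a target value (assumed to be 0), and a lower margin of $1/((r-1)n)$ with renormalization keeps every value sampleable. The names `leading_count`, `umda_step`, and `run`, as well as all parameter values, are illustrative.

```python
import random


def leading_count(x, target=0):
    """Number of leading positions of x equal to target.

    Assumed form of an r-valued LeadingOnes; the exact definition is not
    given in the abstract.
    """
    count = 0
    for v in x:
        if v != target:
            break
        count += 1
    return count


def umda_step(freqs, lam, mu, rng, fitness=leading_count):
    """One iteration of a univariate, r-valued UMDA-style update (sketch)."""
    n, r = len(freqs), len(freqs[0])
    # Sample lam individuals; every position is drawn independently
    # from its own frequency vector over the r values.
    population = [
        [rng.choices(range(r), weights=freqs[i])[0] for i in range(n)]
        for _ in range(lam)
    ]
    # Keep the mu best individuals.
    population.sort(key=fitness, reverse=True)
    selected = population[:mu]
    # Assumed lower margin so that no value becomes impossible to sample.
    floor = 1.0 / ((r - 1) * n)
    new_freqs = []
    for i in range(n):
        counts = [0] * r
        for x in selected:
            counts[x[i]] += 1
        # Relative frequency among the selected, clamped from below,
        # then renormalized so the row sums to 1.
        row = [max(c / mu, floor) for c in counts]
        total = sum(row)
        new_freqs.append([p / total for p in row])
    return new_freqs, fitness(selected[0])


def run(n=8, r=3, lam=200, mu=50, max_iters=200, seed=0):
    """Iterate until an optimum is sampled; return the iteration or None."""
    rng = random.Random(seed)
    freqs = [[1.0 / r] * r for _ in range(n)]  # uniform initial model
    for t in range(1, max_iters + 1):
        freqs, best = umda_step(freqs, lam, mu, rng)
        if best == n:
            return t
    return None
```

Calling `run()` returns the first iteration in which the best selected individual is optimal, or `None` if the budget runs out. In this sketch, `mu` plays the role of the update strength: a larger `mu` makes each frequency update less noisy, which is the direction the drift analysis suggests when $r$ grows.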