Though numerous solvers have been proposed for the MaxSAT problem, and the benchmark environment such as MaxSAT Evaluations provides a platform for the comparison of the state-of-the-art solvers, existing assessments were usually evaluated based on the quality, e.g., fitness, of the best-found solutions obtained within a given running time budget. However, concerning solely the final obtained solutions regarding specific time budgets may restrict us from comprehending the behavior of the solvers along the convergence process. This paper demonstrates that Empirical Cumulative Distribution Functions can be used to compare MaxSAT local search solvers' anytime performance across multiple problem instances and various time budgets. The assessment reveals distinctions in solvers' performance and displays that the (dis)advantages of solvers adjust along different running times. This work also exhibits that the quantitative and high variance assessment of anytime performance can guide machines, i.e., automatic configurators, to search for better parameter settings. Our experimental results show that the hyperparameter optimization tool, i.e., SMAC, generally achieves better parameter settings of local search when using the anytime performance as the cost function, compared to using the fitness of the best-found solutions.
翻译:尽管针对MaxSAT问题已提出众多求解器,且MaxSAT评估等基准环境为最先进求解器的比较提供了平台,现有评估通常基于在给定运行时间预算内所获最佳解的质量(如适应度)进行。然而,仅关注特定时间预算下的最终解可能限制我们对求解器沿收敛过程的行为理解。本文证明,经验累积分布函数可用于跨多个问题实例及不同时间预算比较MaxSAT局部搜索求解器的任意时间性能。该评估揭示了求解器性能的差异,并显示求解器的(劣势)优势会随不同运行时间而调整。本文还表明,对任意时间性能进行量化且高方差的评估可引导机器(即自动配置器)搜索更优的参数设置。实验结果显示,当使用任意时间性能作为成本函数时,超参数优化工具SMAC通常能获得比使用最佳解适应度更好的局部搜索参数设置。