Experimental robot optimization often requires evaluating each candidate policy for seconds to minutes. The chosen evaluation time influences optimization through a speed-accuracy tradeoff: shorter evaluations enable faster iteration but are more susceptible to noise. Here, we introduce a supplement to the CMA-ES optimization algorithm, named Adaptive Sampling CMA-ES (AS-CMA), which assigns sampling time to candidates based on predicted sorting difficulty, aiming to achieve consistent precision. We compared AS-CMA with CMA-ES and Bayesian optimization using a range of static sampling times in four simulated cost landscapes. AS-CMA converged in 98% of all runs without adjustment to its tunable parameter, and it converged 24-65% faster and with 29-76% lower total cost than each landscape's best CMA-ES static sampling time. Compared with Bayesian optimization, AS-CMA converged more efficiently and reliably in complex landscapes, whereas in simpler landscapes it was less efficient but equally reliable. We deployed AS-CMA in an exoskeleton optimization experiment and found that the optimizer's behavior was consistent with expectations. These results indicate that AS-CMA can improve optimization efficiency in the presence of noise while adding minimal setup complexity and tuning requirements.
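The speed-accuracy tradeoff described in the abstract can be illustrated with a minimal sketch. It is not the AS-CMA procedure itself, whose details are not given here. It assumes that a candidate's cost is the mean of independent Gaussian measurements and that longer sampling simply means more measurements. The function names `misrank_probability` and `samples_for_target` are hypothetical and chosen for illustration only.

```python
import math


def misrank_probability(cost_gap, noise_std_per_sample, n_samples):
    """Probability that two candidates are sorted in the wrong order.

    Assumption (not from the source): each candidate's cost estimate is the
    mean of n_samples independent Gaussian measurements with standard
    deviation noise_std_per_sample.
    """
    # Standard deviation of the difference between the two sample means.
    diff_std = noise_std_per_sample * math.sqrt(2.0 / n_samples)
    # P(observed difference has the wrong sign) = Phi(-cost_gap / diff_std).
    return 0.5 * math.erfc(cost_gap / (diff_std * math.sqrt(2.0)))


def samples_for_target(cost_gap, noise_std_per_sample, target_prob,
                       max_samples=10_000):
    """Smallest sample count whose misranking probability is <= target_prob.

    Returns max_samples if the target cannot be met within that budget.
    """
    for n in range(1, max_samples + 1):
        if misrank_probability(cost_gap, noise_std_per_sample, n) <= target_prob:
            return n
    return max_samples


# Candidates with similar costs (small gap) are harder to sort and need
# more samples to reach the same misranking probability.
easy_pair = samples_for_target(cost_gap=1.0, noise_std_per_sample=1.0, target_prob=0.05)
hard_pair = samples_for_target(cost_gap=0.5, noise_std_per_sample=1.0, target_prob=0.05)
```

Under these assumptions, halving the cost gap roughly quadruples the number of samples needed for the same sorting reliability. This is why a single static sampling time tends to be either wasteful for easily sorted candidates or too noisy for closely matched ones.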