We consider problems in which a mobile robot samples an unknown function defined over its operating space in order to find a global optimum of that function. The path traveled by the robot matters, since it influences energy and time requirements. We take a branch-and-bound algorithm called deterministic optimistic optimization and extend it to the path-aware setting, obtaining path-aware optimistic optimization (OOPA). In this new algorithm, the robot decides how to move next by solving an optimal control problem that maximizes the long-term impact of the robot trajectory on lowering the upper bound, weighted by bound and function values to focus the search on the optima. An approximate version of this optimal control problem is solved with an online version of value iteration. In extensive experiments in two dimensions, OOPA outperforms path-unaware and local-optimization baselines.
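
For orientation, the path-unaware starting point can be illustrated with a minimal sketch of deterministic optimistic optimization in one dimension. This is not OOPA and is not taken from the paper: it ignores travel cost entirely, it assumes a known Lipschitz constant (`lipschitz`) to build the upper bounds, and the function name `doo_maximize`, the trisection scheme, and the test function are all hypothetical choices made for illustration.

```python
import heapq


def doo_maximize(f, lo, hi, lipschitz, n_evals):
    """Illustrative, path-unaware DOO sketch on the interval [lo, hi].

    Assumption: |f(x) - f(y)| <= lipschitz * |x - y|, so that
    f(center) + lipschitz * half_width upper-bounds f on each cell.
    """
    center = 0.5 * (lo + hi)
    value = f(center)
    evals = 1
    best_x, best_f = center, value
    # Max-heap via negated bounds: entries are (-bound, a, b, f(center)).
    heap = [(-(value + lipschitz * 0.5 * (hi - lo)), lo, hi, value)]

    # Each expansion costs two new samples (the middle child reuses the parent's).
    while evals + 2 <= n_evals:
        _, a, b, fc = heapq.heappop(heap)  # cell with the largest upper bound
        third = (b - a) / 3.0
        children = [(a, a + third, None),
                    (a + third, b - third, fc),  # same center as the parent
                    (b - third, b, None)]
        for ca, cb, known in children:
            cc = 0.5 * (ca + cb)
            if known is None:
                val = f(cc)
                evals += 1
            else:
                val = known
            if val > best_f:
                best_x, best_f = cc, val
            bound = val + lipschitz * 0.5 * (cb - ca)
            heapq.heappush(heap, (-bound, ca, cb, val))
    return best_x, best_f


# Hypothetical test function with its maximum at x = 0.3; on [0, 1] its
# slope never exceeds 1.4 in magnitude, so lipschitz=2.0 is a valid bound.
best_x, best_f = doo_maximize(lambda x: -(x - 0.3) ** 2, 0.0, 1.0, 2.0, 101)
```

In this sketch consecutive samples may lie far apart, since the next cell is chosen purely by its upper bound. That jumping is the cost the path-aware setting is meant to account for.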