The tasks that an autonomous agent is expected to perform are often optional, or are incompatible with one another because of the agent's limited actuation capabilities, specifically its dynamics and control input bounds. We encode tasks as time-dependent state constraints and draw on advances in multi-objective optimization to formulate task selection as the problem of choosing a feasible subset of constraints that can be satisfied for all time and that maximizes a performance metric. We show that this problem, although amenable to offline analysis based on reachability or mixed-integer model predictive control, is NP-hard in general and therefore requires heuristics to be solved efficiently. When constraints are observed to be incompatible under a given policy that imposes the task constraints at each time step in an optimization problem, we assign each constraint a Lagrange score based on the variation in its Lagrange multiplier over the compatible time horizon. These scores then determine the order in which constraints are dropped in a greedy strategy, which we further improve with a genetic algorithm. We evaluate our method on a robot waypoint-following task in which the low-level controllers that impose state constraints are Control Barrier Function-based Quadratic Programs, and we compare it with waypoint selection based on knowledge of backward reachable sets.
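To make the score-then-drop idea concrete, the following is a minimal sketch of a greedy strategy driven by Lagrange scores. It is illustrative only and not the paper's implementation. The abstract does not define the score, so total variation of each multiplier history is used here as an assumed stand-in. Dropping the highest-scoring constraint first is likewise an assumption. The function names, the waypoint labels, the multiplier values, and the `is_feasible` oracle (which in practice would roll out the constrained controller) are all hypothetical.

```python
from typing import Callable, Dict, FrozenSet, List, Sequence, Tuple


def lagrange_scores(multipliers: Dict[str, Sequence[float]]) -> Dict[str, float]:
    """Score each constraint by the total variation of its multiplier history.

    ASSUMPTION: total variation (sum of absolute step-to-step changes) stands
    in for the score; the abstract only says it is based on the variation of
    the multipliers over the compatible time horizon.
    """
    return {
        name: sum(abs(b - a) for a, b in zip(lam, lam[1:]))
        for name, lam in multipliers.items()
    }


def greedy_drop(
    constraints: Sequence[str],
    scores: Dict[str, float],
    is_feasible: Callable[[FrozenSet[str]], bool],
) -> Tuple[List[str], List[str]]:
    """Drop constraints one at a time until the remaining set is feasible.

    ASSUMPTION: constraints with the largest score are dropped first.
    `is_feasible` is a hypothetical oracle supplied by the caller.
    """
    active = list(constraints)
    dropped: List[str] = []
    for name in sorted(constraints, key=lambda c: scores[c], reverse=True):
        if is_feasible(frozenset(active)):
            break
        active.remove(name)
        dropped.append(name)
    return active, dropped


# Toy data (hypothetical): multiplier histories recorded while all three
# waypoint constraints were still jointly satisfiable.
history = {
    "wp1": [0.0, 0.0, 0.1],
    "wp2": [0.0, 2.0, 5.0],
    "wp3": [0.0, 0.5, 0.4],
}


def toy_oracle(active: FrozenSet[str]) -> bool:
    # Hypothetical rule: wp2 and wp3 cannot both be kept.
    return not {"wp2", "wp3"} <= active


scores = lagrange_scores(history)
active, dropped = greedy_drop(list(history), scores, toy_oracle)
print(active, dropped)
```

In this toy instance the multiplier of `wp2` varies the most, so it is dropped first, after which the remaining set passes the feasibility check. A genetic algorithm, as mentioned in the abstract, could use the resulting subset as a starting candidate; that step is not sketched here.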