Many sequential decision-making tasks require satisfying multiple, partially contradictory objectives. Existing approaches are monolithic: all objectives are fulfilled by a single policy, that is, a function that selects a sequence of actions. We present auction-based scheduling, a modular framework for multi-objective decision-making problems. Each objective is fulfilled by a separate policy, and the policies can be independently created, modified, and replaced. Naturally, policies with conflicting goals may choose conflicting actions at a given time. To resolve such conflicts and compose the policies, we employ a novel auction-based mechanism. We allocate a bounded budget to each policy, and at each step the policies simultaneously bid from their available budgets for the privilege of being scheduled and choosing an action. Policies express their scheduling urgency through their bids, and the bounded budgets ensure long-run scheduling fairness. We lay the foundations of auction-based scheduling using path planning problems on finite graphs with two temporal objectives. We present decentralized algorithms to synthesize a pair of policies, their initially allocated budgets, and their bidding strategies. We consider three categories of decentralized synthesis problems, parameterized by the assumptions that the policies make about each other: (a) strong synthesis, with no assumptions and the strongest guarantees, (b) assume-admissible synthesis, with the weakest rationality assumptions, and (c) assume-guarantee synthesis, with explicit contract-based assumptions. For reachability objectives, we show that, surprisingly, decentralized assume-admissible synthesis is always possible when the out-degrees of all vertices are at most two.
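As a rough illustration of the bidding mechanism described above, the following Python sketch runs two policies on a small graph in which every vertex has out-degree at most two. Several details are assumptions made for the example and are not stated in the abstract: the winner pays its bid to the loser, ties are broken in favor of policy 0, and the names `auction_step`, `run`, `eager_policy`, and `GRAPH` are hypothetical. The toy bidding strategy (bid half the budget whenever the target is one step away) is not a synthesized strategy from the paper.

```python
from fractions import Fraction
from typing import Callable, Dict, List, Tuple

Vertex = str
# A bidding policy maps (current vertex, own budget) to (bid, chosen successor).
Policy = Callable[[Vertex, Fraction], Tuple[Fraction, Vertex]]

# Hypothetical example graph; every vertex has out-degree at most two.
GRAPH: Dict[Vertex, List[Vertex]] = {"s": ["a", "b"], "a": ["s"], "b": ["s"]}


def auction_step(graph, vertex, policies, budgets):
    """Run one simultaneous auction and return (next vertex, new budgets, winner)."""
    offers = [policy(vertex, budget) for policy, budget in zip(policies, budgets)]
    for (bid, move), budget in zip(offers, budgets):
        if not (0 <= bid <= budget):
            raise ValueError("a bid must lie within the bidder's available budget")
        if move not in graph[vertex]:
            raise ValueError("the chosen action must be an outgoing edge")
    # ASSUMPTION: ties are broken in favor of policy 0.
    winner = 0 if offers[0][0] >= offers[1][0] else 1
    bid, move = offers[winner]
    new_budgets = list(budgets)
    # ASSUMPTION: the winner pays its bid to the loser, so the total budget
    # stays constant and each individual budget stays bounded.
    new_budgets[winner] -= bid
    new_budgets[1 - winner] += bid
    return move, new_budgets, winner


def run(graph, start, policies, budgets, steps):
    """Compose the two policies for a fixed number of steps."""
    path, budgets = [start], list(budgets)
    for _ in range(steps):
        move, budgets, _ = auction_step(graph, path[-1], policies, budgets)
        path.append(move)
    return path, budgets


def eager_policy(target: Vertex) -> Policy:
    """Toy strategy: bid half the budget when the target is one step away."""
    def policy(vertex: Vertex, budget: Fraction):
        successors = GRAPH[vertex]
        if target in successors:
            return budget / 2, target
        return Fraction(0), successors[0]
    return policy


# Policy 0 wants to reach "a"; policy 1 wants to reach "b".
policies = [eager_policy("a"), eager_policy("b")]
path, final_budgets = run(GRAPH, "s", policies, [Fraction(1, 2), Fraction(1, 2)], 4)
print(path, final_budgets)
```

In this run, policy 0 wins the first auction and visits `a`. Paying for that win shifts budget to policy 1, which then outbids policy 0 at the next visit to `s` and reaches `b`. This is the sense in which bounded budgets can enforce fairness over time, under the payment rule assumed here.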