Many sequential decision-making tasks require satisfying multiple, partially contradictory objectives. Existing approaches are monolithic: all objectives are fulfilled by a single policy, that is, a function that selects a sequence of actions. We present auction-based scheduling, a modular framework for multi-objective decision-making problems. Each objective is fulfilled by a separate policy, and the policies can be independently created, modified, and replaced. Naturally, policies with conflicting goals may choose conflicting actions at a given time. To resolve such conflicts and compose the policies, we employ a novel auction-based mechanism. We allocate a bounded budget to each policy, and at each step the policies simultaneously bid from their available budgets for the privilege of being scheduled and choosing an action. Policies express their scheduling urgency through their bids, and the bounded budgets ensure long-run scheduling fairness. We lay the foundations of auction-based scheduling using path planning problems on finite graphs with two temporal objectives. We present decentralized algorithms to synthesize a pair of policies, their initially allocated budgets, and their bidding strategies. We consider three categories of decentralized synthesis problems, parameterized by the assumptions that the policies make on each other: (a) strong synthesis, with no assumptions and the strongest guarantees, (b) assume-admissible synthesis, with the weakest rationality assumptions, and (c) assume-guarantee synthesis, with explicit contract-based assumptions. For reachability objectives, we show that, surprisingly, decentralized assume-admissible synthesis is always possible when the out-degrees of all vertices are at most two.
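To make the scheduling mechanism concrete, the following is a minimal sketch of one way the bidding loop could look on a small graph whose vertices all have out-degree at most two. Several details are assumptions made only for illustration and are not taken from the abstract: the winner pays its bid to the other policy, ties are broken in favor of the first policy, and the toy graph, the `urgency_bid` heuristic, and the names `Policy`, `step`, `run`, and `make_policies` are all hypothetical. It does not implement the synthesis algorithms themselves.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Graph = Dict[str, List[str]]


@dataclass
class Policy:
    name: str
    budget: float                        # remaining budget
    moves: Dict[str, str]                # memoryless action choice per vertex
    bid: Callable[[str, float], float]   # (vertex, own budget) -> bid


def urgency_bid(near_target: str, hub: str) -> Callable[[str, float], float]:
    """Hypothetical heuristic: bid everything one step before the target,
    half the budget at the branching vertex, and nothing elsewhere."""
    def bid(vertex: str, budget: float) -> float:
        if vertex == near_target:
            return budget
        if vertex == hub:
            return budget / 2
        return 0.0
    return bid


def step(graph: Graph, vertex: str, p0: Policy, p1: Policy) -> Tuple[str, str]:
    """One scheduling round: both policies bid simultaneously, the higher
    bidder is scheduled and chooses the next vertex."""
    # Clamp bids to the available budget.
    b0 = min(max(p0.bid(vertex, p0.budget), 0.0), p0.budget)
    b1 = min(max(p1.bid(vertex, p1.budget), 0.0), p1.budget)
    # Assumption: ties go to p0, and the winner pays its bid to the loser.
    winner, loser, price = (p0, p1, b0) if b0 >= b1 else (p1, p0, b1)
    winner.budget -= price
    loser.budget += price
    nxt = winner.moves[vertex]
    if nxt not in graph[vertex]:
        raise ValueError(f"{winner.name} chose a non-edge {vertex}->{nxt}")
    return nxt, winner.name


def run(graph: Graph, start: str, p0: Policy, p1: Policy,
        steps: int) -> Tuple[List[str], List[str]]:
    """Run the auction for a fixed number of steps; return path and schedule."""
    path, schedule = [start], []
    for _ in range(steps):
        nxt, who = step(graph, path[-1], p0, p1)
        path.append(nxt)
        schedule.append(who)
    return path, schedule


# Toy graph: policy A wants to reach t1, policy B wants to reach t2.
GRAPH: Graph = {
    "s": ["u", "v"],
    "u": ["t1", "s"],
    "v": ["t2", "s"],
    "t1": ["s"],
    "t2": ["s"],
}


def make_policies() -> Tuple[Policy, Policy]:
    a = Policy("A", 0.5,
               {"s": "u", "u": "t1", "v": "s", "t1": "s", "t2": "s"},
               urgency_bid("u", "s"))
    b = Policy("B", 0.5,
               {"s": "v", "v": "t2", "u": "s", "t1": "s", "t2": "s"},
               urgency_bid("v", "s"))
    return a, b


if __name__ == "__main__":
    a, b = make_policies()
    path, schedule = run(GRAPH, "s", a, b, steps=5)
    print(path, schedule, a.budget, b.budget)
```

In this toy run, policy A spends its budget to reach `t1` first, which leaves policy B with enough budget to win the following rounds and reach `t2`. This illustrates, under the payment rule assumed here, how bounded budgets can keep either policy from being scheduled indefinitely.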