Weighted timed games are two-player zero-sum games played in a timed automaton equipped with integer weights. We consider optimal reachability objectives, in which one of the players, that we call Min, wants to reach a target location while minimising the cumulated weight. While knowing if Min has a strategy to guarantee a value lower than a given threshold is known to be undecidable (with two or more clocks), several conditions, one of them being the divergence, have been given to recover decidability. In such weighted timed games (like in untimed weighted games in the presence of negative weights), Min may need finite memory to play (close to) optimally. This is thus tempting to try to emulate this finite memory with other strategic capabilities. In this work, we allow the players to use stochastic decisions, both in the choice of transitions and of timing delays. We give for the first time a definition of the expected value in weighted timed games, overcoming several theoretical challenges. We then show that, in divergent weighted timed games, the stochastic value is indeed equal to the classical (deterministic) value, thus proving that Min can guarantee the same value while only using stochastic choices, and no memory.
翻译:加权时间博弈是在配备整数权重的时间自动机上进行的双人零和博弈。我们考虑最优可达性目标,其中一方(称为Min)希望以最小化累积权重的方式到达目标位置。虽然已知Min是否存在策略保证低于给定阈值的价值在具有两个或更多时钟时是不可判定的,但已经给出了若干条件(其中之一是发散性)来恢复可判定性。在这类加权时间博弈中(如同存在负权重的无时间加权博弈),Min可能需要有限记忆来实现(接近)最优策略。因此,尝试用其他策略能力模拟这种有限记忆颇具吸引力。本文允许玩家在转移选择和延迟定时两方面采用随机决策。我们首次给出加权时间博弈中期望值的定义,克服了若干理论挑战。随后证明,在发散加权时间博弈中,随机值确实等于经典(确定性)值,从而证明Min仅使用随机选择且无需记忆即可保证相同价值。