Weighted timed games are two-player zero-sum games played on a timed automaton equipped with integer weights. We consider optimal reachability objectives, in which one of the players, whom we call Min, wants to reach a target location while minimising the accumulated weight. Although deciding whether Min has a strategy guaranteeing a value below a given threshold is known to be undecidable (with two or more clocks), several conditions, one of them being divergence, have been given to recover decidability. In such weighted timed games (as in untimed weighted games in the presence of negative weights), Min may need finite memory to play (close to) optimally. It is thus tempting to try to emulate this finite memory with other strategic capabilities. In this work, we allow the players to use stochastic decisions, both in the choice of transitions and of timing delays. We give a definition of the expected value in weighted timed games. We then show that, in divergent weighted timed games as well as in (untimed) weighted games (which we call shortest-path games in the following), the stochastic value is indeed equal to the classical (deterministic) value, thus proving that Min can guarantee the same value while using only stochastic choices, and no memory.
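To make the trade-off between memory and randomisation concrete, here is a minimal sketch on a small untimed shortest-path game. The game itself is a hypothetical illustration chosen for this sketch, not one taken from the text: Min's vertex has an edge of weight 0 to the target and an edge of weight 0 to Max's vertex; Max's vertex has an edge of weight -1 back to Min's vertex and an edge of weight -W to the target. The names `W`, `expected_value`, `randomised_memoryless_value` and `counter_strategy_value` are likewise assumptions. The sketch also assumes that Max answers a fixed strategy of Min with a memoryless response, and that a play never reaching the target costs Min +infinity.

```python
from fractions import Fraction

W = 10  # hypothetical weight magnitude on Max's exit edge (assumption)
INF = float("inf")  # payoff of a play that never reaches the target


def expected_value(p, max_exits):
    """Expected weight from Min's vertex when Min moves to Max's vertex
    with probability p and goes to the target (weight 0) otherwise."""
    if max_exits:
        # Max takes the -W edge to the target whenever it gets to play.
        return -p * W
    if p == 1:
        # Max always returns and Min never leaves: target never reached.
        return INF
    # Max always returns: each round trip costs -1, and the expected
    # number of round trips is p / (1 - p) (geometric distribution).
    return -p / (1 - p)


def randomised_memoryless_value(p):
    """Value Min secures with parameter p against Max's best
    memoryless response (Max maximises)."""
    return max(expected_value(p, True), expected_value(p, False))


def counter_strategy_value(n):
    """Value of the finite-memory strategy 'visit Max's vertex n times,
    then go to the target'. Max either always returns (total -n) or
    exits at its k-th turn (total -(k - 1) - W)."""
    outcomes = [-n] + [-(k - 1) - W for k in range(1, n + 1)]
    return max(outcomes)


if __name__ == "__main__":
    # Deterministic memoryless choices correspond to p = 0 and p = 1.
    print("p = 0:", randomised_memoryless_value(Fraction(0)))
    print("p = 1:", randomised_memoryless_value(Fraction(1)))
    # A counter with W states secures -W.
    print("counter:", counter_strategy_value(W))
    # Randomised memoryless strategies approach -W as p tends to 1.
    for p in (Fraction(1, 2), Fraction(9, 10), Fraction(99, 100)):
        print("p =", p, "->", randomised_memoryless_value(p))
```

In this toy game, the two deterministic memoryless strategies of Min secure only 0 or +infinity, a counter of size W secures -W, and the randomised memoryless strategy secures max(-pW, -p/(1-p)), which gets arbitrarily close to -W as p tends to 1. This is consistent with the claim that stochastic choices can stand in for memory, at least up to an arbitrarily small loss in this example.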