Weighted timed games are two-player zero-sum games played in a timed automaton equipped with integer weights. We consider optimal reachability objectives, in which one of the players, that we call Min, wants to reach a target location while minimising the cumulated weight. While knowing if Min has a strategy to guarantee a value lower than a given threshold is known to be undecidable (with two or more clocks), several conditions, one of them being the divergence, have been given to recover decidability. In such weighted timed games (like in untimed weighted games in the presence of negative weights), Min may need finite memory to play (close to) optimally. This is thus tempting to try to emulate this finite memory with other strategic capabilities. In this work, we allow the players to use stochastic decisions, both in the choice of transitions and of timing delays. We give for the first time a definition of the expected value in weighted timed games, overcoming several theoretical challenges. We then show that, in divergent weighted timed games, the stochastic value is indeed equal to the classical (deterministic) value, thus proving that Min can guarantee the same value while only using stochastic choices, and no memory.
翻译:加权时间博弈是两类玩家在装备整数权重的时间自动机中进行的零和博弈。我们考虑最优可达性目标,其中一方玩家(称为Min)希望在最小化累积权重的同时到达目标位置。虽然已知Min是否有策略保证低于给定阈值的值具有不可判定性(当有两个或更多时钟时),但已给出若干条件(其中之一是发散性)以恢复可判定性。在此类加权时间博弈中(如同存在负权重的非时间加权博弈),Min可能需要有限记忆才能实现(近似)最优策略。因此,尝试用其他策略能力来模拟这种有限记忆颇具吸引力。在本研究中,我们允许玩家在转移选择和时序延迟中采用随机决策。我们首次给出了加权时间博弈中期望值的定义,克服了若干理论挑战。随后证明,在发散性加权时间博弈中,随机值确实等于经典(确定性)值,从而表明Min仅需使用随机选择而无需记忆即可保证相同的值。