Gameplay under various forms of uncertainty has been widely studied. Feldman et al. (2010) studied a particularly low-information setting in which a player observes the opponent's actions but no payoffs, not even the player's own, and introduced an algorithm that nonetheless guarantees the player's payoff approaches the minimax-optimal value (i.e., zero) in a symmetric zero-sum game. Against an opponent playing a minimax-optimal strategy, approaching the value of the game is the best one can hope to guarantee. However, a wealth of research in behavioral economics shows that people often do not make perfectly rational, optimal decisions. Here we consider whether it is possible to actually win in this setting when the opponent is behaviorally biased. We model several deterministic, biased opponents and show that, even without knowing the game matrix in advance or observing any payoffs, it is possible to exploit each bias to win nearly every round, provided that each action in the game beats and is beaten by at least one other action. We also provide a partial characterization of the kinds of biased strategies that can be exploited to win nearly every round, and give algorithms for beating some kinds of biased strategies even when we do not know which strategy the opponent uses.
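The condition on the game, that each action beats and is beaten by at least one other action, can be made concrete. The following is a minimal sketch, assuming the symmetric zero-sum game is given as a payoff matrix `A` in which `A[i][j] > 0` means action `i` beats action `j`. The function names and the example matrices are illustrative assumptions, not taken from the paper. Note that the player in the setting studied here never sees this matrix; the check only illustrates which games the result covers.

```python
from typing import List

Matrix = List[List[float]]


def is_symmetric_zero_sum(A: Matrix) -> bool:
    """Check that A is square and antisymmetric (A[i][j] == -A[j][i])."""
    n = len(A)
    if any(len(row) != n for row in A):
        return False
    return all(A[i][j] == -A[j][i] for i in range(n) for j in range(n))


def every_action_beats_and_is_beaten(A: Matrix) -> bool:
    """Check that each action wins against some action and loses to some action."""
    for row in A:
        beats_some = any(payoff > 0 for payoff in row)
        beaten_by_some = any(payoff < 0 for payoff in row)
        if not (beats_some and beaten_by_some):
            return False
    return True


# Rock-paper-scissors: rows and columns ordered rock, paper, scissors.
RPS: Matrix = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]

# A hypothetical game with a dominant action 0 that is never beaten.
DOMINATED: Matrix = [
    [0, 1, 1],
    [-1, 0, 1],
    [-1, -1, 0],
]

if __name__ == "__main__":
    print(every_action_beats_and_is_beaten(RPS))
    print(every_action_beats_and_is_beaten(DOMINATED))
```

Rock-paper-scissors satisfies the condition, whereas the second matrix does not, because its first action is never beaten and its last action never wins.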