External regret certifies stability only against replacing one's behavior with a fixed alternative. In a quantum game, this misses a natural physical move: a player can apply a local completely positive trace-preserving (CPTP) map to the state it actually received or prepared. We introduce coherent swap regret, the regret benchmark against all such local CPTP deviations, and give an algorithm achieving $O(\sqrt{dT\log d})$ coherent swap regret via entropic mirror ascent on the CPTP Choi slice with a fixed-point play rule. The main result is a three-level landscape of deviation classes. Replacement channels recover ordinary external regret at rate $\Theta(\sqrt{T\log d})$. Unital channels, including unitary deviations and mixtures of unitaries, have zero minimax regret. Deterministic measurement-and-preparation channels already force $\Omega(\sqrt{dT\log d})$ regret in the moderate-horizon regime, and this rate is also attainable against all CPTP deviations. Thus the hardness comes from non-unital use of the recommendation register, not from quantum coherence alone. As an application, decentralized full-information learning in finite quantum games reaches an $\varepsilon$-approximate separable quantum correlated equilibrium after $T=O(\max_i d_i\log d_i/\varepsilon^2)$ rounds. We identify these equilibria with channel-proofness of mediated quantum recommendation protocols, give an SDP audit for local CPTP exploitability that applies to arbitrary finite-dimensional states, and include a probing-bandit extension with pseudo-regret $O(d^{4/3}T^{2/3}(\log d)^{1/3})$ under Haar-random pure-state probes.
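The deviation classes named in the abstract can be told apart numerically from a channel's Choi matrix. The following is a minimal sketch, not the paper's algorithm or its SDP audit: it uses only the standard facts that a map is CPTP when its Choi matrix is positive semidefinite with the output traced out to the identity, and unital when tracing out the input gives the identity. All function and variable names (`choi`, `is_cptp`, `is_unital`, `apply_channel`, the qubit examples) are hypothetical and chosen for illustration.

```python
import numpy as np

def choi(kraus_ops):
    """Choi matrix J = sum_{ij} |i><j| (x) Phi(|i><j|) from Kraus operators."""
    d_out, d_in = kraus_ops[0].shape
    J = np.zeros((d_in * d_out, d_in * d_out), dtype=complex)
    for i in range(d_in):
        for j in range(d_in):
            E = np.zeros((d_in, d_in), dtype=complex)
            E[i, j] = 1.0
            out = sum(K @ E @ K.conj().T for K in kraus_ops)
            J += np.kron(E, out)
    return J

def partial_traces(J, d_in, d_out):
    """Return (Tr_out J, Tr_in J) for J acting on input (x) output."""
    T = J.reshape(d_in, d_out, d_in, d_out)
    tr_out = np.trace(T, axis1=1, axis2=3)  # d_in x d_in
    tr_in = np.trace(T, axis1=0, axis2=2)   # d_out x d_out, equals Phi(I)
    return tr_out, tr_in

def is_cptp(J, d_in, d_out, tol=1e-9):
    """Completely positive (J >= 0) and trace preserving (Tr_out J = I)."""
    psd = np.linalg.eigvalsh((J + J.conj().T) / 2).min() >= -tol
    tr_out, _ = partial_traces(J, d_in, d_out)
    return bool(psd and np.allclose(tr_out, np.eye(d_in), atol=tol))

def is_unital(J, d_in, d_out, tol=1e-9):
    """Unital means Phi(I) = I, i.e. Tr_in J = I."""
    _, tr_in = partial_traces(J, d_in, d_out)
    return bool(np.allclose(tr_in, np.eye(d_out), atol=tol))

def apply_channel(kraus_ops, rho):
    """Phi(rho) = sum_k K rho K^dagger."""
    return sum(K @ rho @ K.conj().T for K in kraus_ops)

d = 2
# Unitary deviation: the Hadamard gate as a single Kraus operator.
hadamard = [np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)]
# Replacement channel rho -> Tr(rho) |0><0|, with Kraus operators |0><i|.
replace_kraus = []
for i in range(d):
    K = np.zeros((d, d), dtype=complex)
    K[0, i] = 1.0
    replace_kraus.append(K)

J_unitary = choi(hadamard)
J_replace = choi(replace_kraus)
maximally_mixed = np.eye(d, dtype=complex) / d
```

In this sketch both channels pass `is_cptp`, but only the Hadamard channel passes `is_unital`. The maximally mixed state is left unchanged by any unital channel, which is at least consistent with the abstract's statement that unital deviations carry zero minimax regret; whether the paper's argument proceeds this way is an assumption here.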