The problem of mismatched guesswork considers the additional cost incurred by using a guessing function that is optimal for a distribution $q$ when the random variable to be guessed is actually distributed according to a different distribution $p$. This problem has been well studied from an asymptotic perspective, but there has been little work on quantifying the difference in guesswork between optimal and suboptimal strategies for a finite number of symbols. In this non-asymptotic regime, we consider a definition of mismatched guesswork and show that it is equivalent to a variant of the Kendall tau permutation distance applied to optimal guessing functions for the mismatched distributions. We use this formulation to bound the cost of guesswork under mismatch, given a bound on the total variation distance between the two distributions.
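
To make the setting concrete, the following is a minimal sketch and not the paper's own construction. It assumes that the cost of mismatch is measured as the difference in expected number of guesses under $p$ between the $q$-optimal and $p$-optimal guessing orders, and that ties are broken by symbol index. The function names and the example distributions are hypothetical. The weighted pair count shown is one natural variant of the Kendall tau distance; whether it coincides with the variant used in the paper is an assumption.

```python
from itertools import combinations


def optimal_order(dist):
    """Guessing order that is optimal for `dist`: most probable symbol first.

    Ties are broken by symbol index (an assumption made for this sketch).
    """
    return sorted(range(len(dist)), key=lambda s: (-dist[s], s))


def expected_guesswork(p, order):
    """Expected number of guesses when symbols follow `p` and are guessed in `order`."""
    return sum((rank + 1) * p[sym] for rank, sym in enumerate(order))


def total_variation(p, q):
    """Total variation distance between two distributions on the same alphabet."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def discordant_pairs(order_a, order_b):
    """Pairs of symbols that the two guessing orders rank in opposite ways."""
    pos_a = {s: r for r, s in enumerate(order_a)}
    pos_b = {s: r for r, s in enumerate(order_b)}
    return [
        (x, y)
        for x, y in combinations(order_a, 2)
        if (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) < 0
    ]


def weighted_kendall_tau(p, order_a, order_b):
    """Each discordant pair is weighted by the gap in true probability (assumed variant)."""
    return sum(abs(p[x] - p[y]) for x, y in discordant_pairs(order_a, order_b))


# Hypothetical example: the true distribution p and the assumed distribution q
# disagree on which of the first two symbols is more likely.
p = [0.5, 0.3, 0.2]
q = [0.3, 0.5, 0.2]

order_p = optimal_order(p)
order_q = optimal_order(q)

excess = expected_guesswork(p, order_q) - expected_guesswork(p, order_p)
tau = len(discordant_pairs(order_p, order_q))
weighted_tau = weighted_kendall_tau(p, order_p, order_q)
tv = total_variation(p, q)

print(f"excess guesswork: {excess:.3f}")
print(f"Kendall tau (discordant pairs): {tau}")
print(f"weighted Kendall tau: {weighted_tau:.3f}")
print(f"total variation distance: {tv:.3f}")
```

In this sketch the excess expected guesswork equals the weighted count of discordant pairs, because each pair of symbols contributes the probability of whichever symbol is guessed later, and swapping a pair changes that contribution by the gap between the two probabilities.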