Many high-stakes decision-making problems, such as those in cybersecurity and economics, can be modeled as competitive resource allocation games. In these games, multiple players must allocate limited resources to overcome their opponents while minimizing their own losses. However, existing means of assessing the performance of resource allocation algorithms are highly disparate and problem-dependent. As a result, evaluating such algorithms is unreliable or impossible in many contexts and applications, especially when the level of feedback differs. To resolve this problem, we propose a generalized definition of payoff that uses an arbitrary user-provided function, which unifies performance evaluation across all contexts and levels of feedback. Using this definition, we develop metrics for evaluating player performance, and estimators to approximate them under uncertainty (i.e., bandit or semi-bandit feedback). These metrics and their respective estimators provide a problem-agnostic means to contextualize and evaluate algorithm performance. To validate the accuracy of our estimators, we use the Colonel Blotto ($\mathcal{CB}$) game as an example. To this end, we propose a graph-pruning approach to efficiently identify feasible opponent decisions, which are used in computing our estimation metrics. Using various resource allocation algorithms and game parameters, we simulate a suite of $\mathcal{CB}$ games and use it to compute and evaluate the quality of our estimates. These simulations empirically show our approach to be highly accurate at estimating the metrics associated with the unseen outcomes of an opponent's latent behavior.
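As an illustration of a payoff defined through an arbitrary user-provided function, the following is a minimal sketch for a single round of a Colonel Blotto game under full feedback. It is not the paper's formulation: the names `battlefield_outcomes`, `payoff`, `count_wins`, and `net_margin` are hypothetical, and the sketch assumes integer allocations, that the larger allocation wins a battlefield, and that equal allocations tie.

```python
from typing import Callable, Sequence


def battlefield_outcomes(mine: Sequence[int], theirs: Sequence[int]) -> list[int]:
    """Return +1 (win), 0 (tie), or -1 (loss) for each battlefield.

    Assumption: the player who allocates more resources wins the battlefield.
    """
    if len(mine) != len(theirs):
        raise ValueError("both allocations must cover the same battlefields")
    return [(a > b) - (a < b) for a, b in zip(mine, theirs)]


def payoff(
    mine: Sequence[int],
    theirs: Sequence[int],
    score: Callable[[Sequence[int]], float],
) -> float:
    """Apply a user-provided scoring function to the per-battlefield outcomes."""
    return score(battlefield_outcomes(mine, theirs))


def count_wins(outcomes: Sequence[int]) -> float:
    """Hypothetical scoring rule: number of battlefields won."""
    return float(sum(1 for o in outcomes if o > 0))


def net_margin(outcomes: Sequence[int]) -> float:
    """Hypothetical scoring rule: wins minus losses."""
    return float(sum(outcomes))


if __name__ == "__main__":
    mine, theirs = [3, 2, 1], [1, 2, 3]
    print(payoff(mine, theirs, count_wins))
    print(payoff(mine, theirs, net_margin))
```

Swapping the `score` argument changes how the same outcome is evaluated without touching the game logic. Under bandit or semi-bandit feedback the opponent's allocation would not be observed directly, so a payoff of this kind would have to be estimated rather than computed as shown here.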