Rating systems play a crucial role in evaluating player skill across competitive environments. The Elo rating system, originally designed for deterministic, complete-information games such as chess, has been widely adopted and modified in various domains. However, the traditional Elo rating system uses only game outcomes to compute ratings and assumes that all players start from equivalent initial states. This raises important methodological challenges for skill modelling in popular games such as Rummy, which are partially randomized and involve incomplete information. In this paper, we examine the limitations of conventional Elo ratings when applied to luck-driven environments and propose a modified Elo framework specifically tailored for Rummy. Our approach incorporates score-based performance metrics and explicitly models the influence of initial hand quality to disentangle skill from luck. Through extensive simulations involving 270,000 games across six strategies of varying sophistication, we demonstrate that our proposed system achieves stable convergence, superior discriminative power, and enhanced predictive accuracy compared to traditional Elo formulations. The framework remains computationally simple while capturing the interplay of skill, strategy, and randomness, and it is broadly applicable to other stochastic competitive environments.
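For reference, the outcome-only baseline that the abstract contrasts against can be sketched as the standard two-player Elo update. This is a minimal illustration, not the modified framework proposed in the paper: the function names, the K-factor of 32, and the logistic scale of 400 are conventional choices assumed here, and the score-based and hand-quality adjustments are not shown because the abstract does not specify them.

```python
def expected_score(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Standard Elo expected score of player A against player B.

    The scale of 400 is the conventional chess value, assumed here.
    """
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / scale))


def elo_update(rating_a: float, rating_b: float, score_a: float, k: float = 32.0):
    """Return updated (rating_a, rating_b) after one game.

    score_a is 1.0 for a win by A, 0.5 for a draw, 0.0 for a loss.
    Only the outcome enters the update: the margin of victory and the
    starting position (for example, the dealt hand) are ignored.
    k = 32 is an assumed, commonly used K-factor.
    """
    e_a = expected_score(rating_a, rating_b)
    delta = k * (score_a - e_a)
    # The update is zero-sum: whatever A gains, B loses.
    return rating_a + delta, rating_b - delta


if __name__ == "__main__":
    print(elo_update(1500.0, 1500.0, 1.0))
```

Because the update depends only on `score_a`, a narrow win from a poor starting hand and a lopsided win from a strong one move the ratings identically, which is the limitation the abstract highlights for games such as Rummy.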