We propose a general approach to quantitatively assessing the risk and vulnerability of artificial intelligence (AI) systems to biased decisions. The guiding principle of the proposed approach is that any AI algorithm must outperform a random guesser. This may appear trivial, but empirical results from a simplistic sequential decision-making scenario involving roulette games show that sophisticated AI-based approaches often underperform the random guesser by a significant margin. We highlight that modern recommender systems may exhibit a similar tendency to favor overly low-risk options. We argue that this "random guesser test" can serve as a useful tool for evaluating the utility of AI actions, and that it points toward increased exploration as a potential improvement to such systems.
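The core idea can be sketched as a simple evaluation procedure: run a candidate policy and a uniform-random baseline on the same action space and compare their average rewards. The following is a minimal illustrative sketch, not the paper's exact roulette setup; the three-armed bandit environment and its payout probabilities are hypothetical, chosen only to make the test concrete.

```python
import random

# Hypothetical expected success probability per arm (illustrative only).
PAYOUTS = {0: 0.1, 1: 0.5, 2: 0.9}

def pull(arm, rng):
    """Bernoulli reward drawn with the arm's (hypothetical) success probability."""
    return 1.0 if rng.random() < PAYOUTS[arm] else 0.0

def mean_reward(choose, trials, rng):
    """Average reward of a policy `choose(rng) -> arm` over `trials` pulls."""
    return sum(pull(choose(rng), rng) for _ in range(trials)) / trials

def random_guesser_test(choose, trials=10_000, seed=0):
    """Return True iff the policy at least matches a uniform random guesser."""
    arms = list(PAYOUTS)
    policy_score = mean_reward(choose, trials, random.Random(seed))
    baseline_score = mean_reward(lambda rng: rng.choice(arms), trials,
                                 random.Random(seed + 1))
    return policy_score >= baseline_score

# A policy locked onto the best arm passes; one locked onto the worst fails.
assert random_guesser_test(lambda rng: 2)
assert not random_guesser_test(lambda rng: 0)
```

A policy that always picks the highest-payout arm comfortably clears the baseline, while an overly conservative policy stuck on a poor arm fails the test, mirroring the abstract's point that favoring overly low-risk options can leave a system below the random-guesser bar.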