We propose a general approach to quantitatively assessing the risk and vulnerability of artificial intelligence (AI) systems to biased decisions. The guiding principle of the proposed approach is that any AI algorithm must outperform a random guesser. This may appear trivial, but empirical results from a simplified sequential decision-making scenario involving roulette games show that sophisticated AI-based approaches often underperform the random guesser by a significant margin. We highlight that modern recommender systems may exhibit a similar tendency to favor excessively low-risk options. We argue that this "random guesser test" can serve as a useful tool for evaluating the rationality of AI actions, and that it also points toward increased exploration as a potential improvement to such systems.
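The "random guesser test" can be sketched in a few lines. The following is a minimal illustration, not the paper's exact experimental setup: it assumes a hypothetical bet menu for a simplified European-style roulette wheel, pits a uniformly random guesser against a policy that always takes the safest bet, and flags the candidate policy only if it falls short of the random baseline.

```python
import random

# Hypothetical bet menu (an assumption for illustration):
# (name, win probability, net payout per unit staked). All three bets share
# the same house edge of 1/37, so they differ only in variance.
BETS = [
    ("single", 1 / 37, 35),  # straight-up bet on one number
    ("dozen", 12 / 37, 2),   # bet on one dozen
    ("red", 18 / 37, 1),     # even-money colour bet
]

def settle(bet, rng):
    """Net payoff of one unit staked on `bet` for a single spin."""
    _name, win_prob, payout = bet
    return payout if rng.random() < win_prob else -1

def total_payoff(policy, spins=100_000, seed=0):
    """Cumulative payoff of `policy` (a function choosing a bet) over many spins."""
    rng = random.Random(seed)
    return sum(settle(policy(rng), rng) for _ in range(spins))

random_guesser = lambda rng: rng.choice(BETS)  # the test's baseline
low_risk = lambda rng: BETS[-1]                # always the lowest-variance bet

baseline = total_payoff(random_guesser)
candidate = total_payoff(low_risk)
print("random guesser:", baseline, "| low-risk policy:", candidate)

# The random guesser test: a rational policy should not do meaningfully
# worse than the uniform baseline over the same number of spins.
passes_test = candidate >= baseline
```

Because every bet here has the same expected loss, both totals drift toward the house edge over many spins; the test becomes informative in richer settings where a learned policy's risk aversion systematically sacrifices expected return.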