This work examines an imbalance in artificial intelligence (AI) security research: the field tends to produce more work on attacking AI systems than on defending them. Drawing on related academic papers, we find skewed attack-to-defense ratios across subfields, including federated learning, speech recognition, membership inference, and large language models. The imbalance likely goes beyond a simple paper count: attack papers are routinely evaluated under favorable conditions that make threats look more severe than they are in practice, while defenses are held to a stricter standard that few can meet. The result is a literature rich in demonstrated vulnerabilities and thin on usable, deployed protections. We therefore argue that the AI security research community should better incentivize defense research.