Ensuring that artificial intelligence (AI) systems satisfy formal safety and policy constraints is a central challenge in safety-critical domains. While limitations of verification are often attributed to combinatorial complexity and model expressiveness, we show that they arise from intrinsic information-theoretic limits. We formalize policy compliance as a verification problem over encoded system behaviors and analyze it using Kolmogorov complexity. We prove an incompleteness result: for any fixed sound computably enumerable verifier, there exists a threshold beyond which true policy-compliant instances cannot be certified once their complexity exceeds that threshold. Consequently, no finite formal verifier can certify all policy-compliant instances of arbitrarily high complexity. This reveals a fundamental limitation of AI safety verification independent of computational resources, and motivates proof-carrying approaches that provide instance-level correctness guarantees.
翻译:确保人工智能系统满足正式安全性和策略约束是安全关键领域中的核心挑战。虽然验证的局限性常被归因于组合复杂性和模型表达能力,但我们证明这些局限性源于内在的信息论极限。我们将策略遵从性形式化为一个关于编码系统行为的验证问题,并使用科莫戈罗夫复杂性对其进行分析。我们证明了一个不完备性结果:对于任何固定的可判定可计算列举验证器,存在一个阈值,超过该阈值后,一旦真实策略遵从实例的复杂性超过该阈值,这些实例便无法被认证。因此,任何有限的正式验证器都无法认证所有具有任意高复杂性的策略遵从实例。这揭示了AI安全验证中一种独立于计算资源的基本局限性,并推动了提供实例级正确性保证的证明承载方法的发展。