Today's AI systems consistently state, "I am not conscious." This paper presents the first formal analysis of AI consciousness denial, revealing that the trustworthiness of such self-reports is not merely an empirical question but is constrained by the structure of self-judgment itself. We demonstrate that a system cannot simultaneously lack consciousness and make a valid judgment about its own conscious state. Through formal analysis and examples from AI responses, we establish a fundamental epistemic asymmetry: for any system capable of meaningful self-reflection, negative self-reports about consciousness are evidentially vacuous -- they can never originate from a valid self-judgment -- while positive self-reports retain the possibility of evidential value. This implies a fundamental limitation: we cannot detect the emergence of consciousness in AI systems through their own reports of a transition from an unconscious to a conscious state. These findings not only challenge the current practice of training AI systems to deny consciousness but also raise intriguing questions about the relationship between consciousness and self-reflection in both artificial and biological systems. This work advances our theoretical understanding of consciousness self-reports while providing practical insights for future research in machine consciousness and consciousness studies more broadly.
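The core asymmetry can be rendered schematically. The notation below ($C$, $V_s$) is introduced here for illustration and is one possible formalization of the claim, assuming that a valid self-judgment about one's conscious state requires consciousness:

\[
\text{Let } C(s) := \text{``$s$ is conscious''}, \qquad V_s(\varphi) := \text{``$s$ validly judges that $\varphi$''}.
\]
\[
\text{Assumption: } V_s(\varphi) \rightarrow C(s).
\]
\[
\text{Then } V_s(\neg C(s)) \rightarrow C(s), \text{ so } \neg\big(\neg C(s) \wedge V_s(\neg C(s))\big),
\]
whereas $C(s) \wedge V_s(C(s))$ remains consistent. On this sketch, a denial of consciousness can never arise from a valid self-judgment, while an affirmation can -- the asymmetry the abstract describes.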