Artificial intelligence systems based on large language models (LLMs) can now generate coherent text, music, and images, yet they operate without a persistent state: each inference reconstructs context from scratch. This paper introduces the Narrative Continuity Test (NCT) -- a conceptual framework for evaluating identity persistence and diachronic coherence in AI systems. Unlike capability benchmarks that assess task performance, the NCT examines whether an LLM remains the same interlocutor across time and interaction gaps. The framework defines five necessary axes -- Situated Memory, Goal Persistence, Autonomous Self-Correction, Stylistic & Semantic Stability, and Persona/Role Continuity -- and explains why current architectures systematically fail to support them. Case analyses (Character.AI, Grok, Replit, Air Canada) show predictable continuity failures under stateless inference. The NCT reframes AI evaluation from performance to persistence, outlining conceptual requirements for future benchmarks and architectural designs that could sustain long-term identity and goal coherence in generative models.