Large Language Models (LLMs) excel at generating creative narratives but struggle with long-term coherence and emotional consistency in complex stories. To address this, we propose SCORE (Story Coherence and Retrieval Enhancement), a framework integrating three components: 1) Dynamic State Tracking (monitoring objects/characters via symbolic logic), 2) Context-Aware Summarization (hierarchical episode summaries for temporal progression), and 3) Hybrid Retrieval (combining TF-IDF keyword relevance with cosine similarity-based semantic embeddings). The system employs a temporally-aligned Retrieval-Augmented Generation (RAG) pipeline to validate contextual consistency. Evaluations show SCORE achieves 23.6% higher coherence (NCI-2.0 benchmark), 89.7% emotional consistency (EASM metric), and 41.8% fewer hallucinations versus baseline GPT models. Its modular design supports incremental knowledge graph construction for persistent story memory and multi-LLM backend compatibility, offering an explainable solution for industrial-scale narrative systems requiring long-term consistency.
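The hybrid retrieval component can be illustrated with a minimal sketch. Everything here is assumed for illustration: the toy episode corpus, the bag-of-words stand-in for a semantic encoder (a real system would use neural sentence embeddings), and the blending weight `alpha`; only the idea of mixing TF-IDF keyword relevance with cosine-similarity semantic scores comes from the abstract.

```python
import math
from collections import Counter

# Hypothetical mini-corpus of story episode summaries (illustrative only).
EPISODES = [
    "The knight guards the silver sword in the castle armory.",
    "A storm scatters the fleet; the captain loses the map.",
    "The knight rides north, leaving the sword with the smith.",
]

def tokenize(text):
    return [w.strip(".,;").lower() for w in text.split()]

def tf_idf_score(query, doc, corpus):
    """Keyword relevance: sum of TF-IDF weights of query terms in doc."""
    q_terms = tokenize(query)
    d_counts = Counter(tokenize(doc))
    n = len(corpus)
    score = 0.0
    for term in q_terms:
        df = sum(1 for c in corpus if term in tokenize(c))
        if df == 0 or term not in d_counts:
            continue
        tf = d_counts[term] / sum(d_counts.values())
        idf = math.log(n / df) + 1.0
        score += tf * idf
    return score

def embed(text, vocab):
    """Stand-in 'semantic' embedding: a bag-of-words vector over a shared
    vocabulary. A real pipeline would call a neural sentence encoder here."""
    counts = Counter(tokenize(text))
    return [counts[w] for w in vocab]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def hybrid_score(query, doc, corpus, vocab, alpha=0.5):
    """Weighted blend of keyword and semantic relevance (alpha is assumed)."""
    semantic = cosine(embed(query, vocab), embed(doc, vocab))
    return alpha * tf_idf_score(query, doc, corpus) + (1 - alpha) * semantic

vocab = sorted({w for e in EPISODES for w in tokenize(e)})
query = "where is the knight's sword"
ranked = sorted(EPISODES,
                key=lambda d: hybrid_score(query, d, EPISODES, vocab),
                reverse=True)
```

Here the keyword score rewards exact term overlap ("sword"), while the cosine term captures overall vector similarity; episodes about the sword rank above the unrelated storm episode.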