Character description generation is an important capability for narrative-focused applications such as summarization, story analysis, and character-driven simulations. However, generating accurate character descriptions from long-form narratives (e.g., novels) is challenging: models must track evolving attributes (e.g., relationships and events), integrate evidence scattered across the text, and infer implicit details. Despite the success of reasoning-enabled LLMs on many benchmarks, we find that their performance on character description generation improves when built-in reasoning is disabled (i.e., with an empty reasoning trace). Motivated by this, we propose a training framework that decouples reasoning from generation. Our approach, which can be applied on top of long-context LLMs or chunk-based methods, consists of a reasoning model that produces a structured QA reasoning trace and a generation model that conditions on this trace to produce the final character description. Experiments on two datasets (BookWorm and CroSS) show that QA-guided reasoning improves faithfulness, informativeness, and grounding over strong long-context baselines.
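The decoupled design described above can be illustrated with a minimal sketch. It is not the authors' implementation: the names `QAPair`, `format_trace`, and `describe_character`, the two callable interfaces, and the prompt layout are all hypothetical, and the sketch assumes the generation model sees only the character name and the QA trace (the abstract does not say whether it also receives the source text).

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class QAPair:
    """One step of a structured QA reasoning trace (hypothetical schema)."""
    question: str
    answer: str


# Hypothetical interfaces: any callables with these shapes will do.
# The reasoner maps (character, context) to a QA trace;
# the generator maps a prompt string to a description string.
ReasoningModel = Callable[[str, str], List[QAPair]]
GenerationModel = Callable[[str], str]


def format_trace(trace: List[QAPair]) -> str:
    """Serialize the QA trace into plain text for the generator prompt."""
    return "\n".join(f"Q: {qa.question}\nA: {qa.answer}" for qa in trace)


def describe_character(
    character: str,
    context: str,
    reasoner: ReasoningModel,
    generator: GenerationModel,
) -> str:
    """Run reasoning and generation as two separate stages."""
    # Stage 1: the reasoning model reads the narrative and emits QA pairs.
    trace = reasoner(character, context)
    # Stage 2: the generation model conditions on the trace
    # (assumption: it does not see the raw context again).
    prompt = (
        f"Character: {character}\n"
        f"Reasoning trace:\n{format_trace(trace)}\n"
        "Description:"
    )
    return generator(prompt)
```

Because both models are passed in as plain callables, the same function could wrap a long-context LLM or a chunk-based pipeline; the `context` argument would then be the full text or a retrieved subset, respectively.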