Existing narrative extraction methods trade off coherence, interactivity, and multi-storyline support. Narrative Maps supports rich interaction and generates multiple storylines as a byproduct of its coverage constraints, but at the cost of individual path coherence. Narrative Trails achieves high coherence through maximum-capacity path optimization but provides no mechanism for user guidance or multiple perspectives. We introduce agenda-based narrative extraction, a method that bridges this gap by integrating large language models (LLMs) into the Narrative Trails pathfinding process to steer storyline construction toward user-specified perspectives. At each step, our approach uses an LLM to rank candidate documents by their alignment with a given agenda while maintaining narrative coherence. Running the algorithm with different agendas yields different storylines through the same corpus. We evaluate our approach on a news article corpus using Claude Opus 4.5 and GPT 5.1 as LLM judges, measuring both coherence and agenda alignment across 64 endpoint pairs and 6 agendas. LLM-driven steering achieves 9.9% higher alignment than keyword matching on semantic agendas (p=0.017), with a 13.3% improvement on \textit{Regime Crackdown} specifically (p=0.037), while keyword matching remains competitive on agendas with literal keyword overlap. The coherence cost is minimal: LLM steering reduces coherence by only 2.2% compared to the agenda-agnostic baseline. Counter-agendas that contradict the source material score uniformly low (2.2-2.5) across all methods, confirming that steering cannot fabricate unsupported narratives.
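To make the two pathfinding ideas in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes documents form a directed acyclic graph (edges point forward in time) whose edge weights are pairwise coherence scores. `max_capacity_path` is a standard widest-path variant of Dijkstra's algorithm, used here as a stand-in for the agenda-agnostic baseline. `steered_path`, `rank_fn`, and `min_coherence` are hypothetical names: `rank_fn` stands in for the LLM ranking call, and `min_coherence` is an assumed coherence floor that candidates must satisfy.

```python
import heapq


def max_capacity_path(graph, source, target):
    """Widest path: maximize the minimum edge weight along the path."""
    best = {source: float("inf")}  # best bottleneck found so far per node
    prev = {}
    heap = [(-float("inf"), source)]  # max-heap via negated capacities
    while heap:
        neg_cap, node = heapq.heappop(heap)
        cap = -neg_cap
        if node == target:
            break
        if cap < best.get(node, 0.0):
            continue  # stale heap entry
        for nxt, weight in graph.get(node, {}).items():
            new_cap = min(cap, weight)
            if new_cap > best.get(nxt, 0.0):
                best[nxt] = new_cap
                prev[nxt] = node
                heapq.heappush(heap, (-new_cap, nxt))
    if target not in best:
        return None, 0.0
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1], best[target]


def can_reach(graph, start, target, min_coherence):
    """Depth-first search using only edges at or above the coherence floor."""
    stack, seen = [start], {start}
    while stack:
        node = stack.pop()
        if node == target:
            return True
        for nxt, weight in graph.get(node, {}).items():
            if weight >= min_coherence and nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return False


def steered_path(graph, source, target, rank_fn, min_coherence):
    """Greedy walk: at each step, take the top-ranked coherent candidate.

    rank_fn is a hypothetical placeholder for the LLM ranking step; it takes
    a list of candidate nodes and returns them ordered best-first.
    Assumes the graph is acyclic, so the walk always terminates.
    """
    path = [source]
    while path[-1] != target:
        candidates = [
            nxt
            for nxt, weight in graph.get(path[-1], {}).items()
            if weight >= min_coherence
            and can_reach(graph, nxt, target, min_coherence)
        ]
        if not candidates:
            return None
        path.append(rank_fn(candidates)[0])
    return path


# Toy corpus: four documents with made-up coherence weights.
graph = {
    "a": {"b": 0.9, "c": 0.7},
    "b": {"d": 0.8},
    "c": {"d": 0.75},
    "d": {},
}
# Made-up agenda-alignment scores standing in for LLM judgments.
agenda_scores = {"b": 0.1, "c": 0.9, "d": 0.5}


def rank_by_agenda(candidates):
    return sorted(candidates, key=agenda_scores.get, reverse=True)


baseline_path, capacity = max_capacity_path(graph, "a", "d")
agenda_path = steered_path(graph, "a", "d", rank_by_agenda, min_coherence=0.7)
```

On this toy graph the baseline follows the most coherent route through `b`, while the steered walk goes through `c` because it ranks higher for the agenda and still clears the coherence floor. Swapping in a different `rank_fn` yields a different storyline between the same endpoints, which mirrors the behavior the abstract describes.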