PlotTwist: A Creative Plot Generation Framework with Small Language Models

Creative plot generation presents a fundamental challenge for language models: transforming a concise premise into a coherent narrative that sustains global structure, character development, and emotional resonance. Although recent Large Language Models (LLMs) demonstrate strong fluency across general-purpose tasks, they typically require preference alignment to perform well on specialized domains such as creative plot generation. However, conducting such alignment at the scale of frontier LLMs is computationally prohibitive, significantly limiting accessibility and practical deployment. To address this, we present PlotTwist, a structured framework that enables Small Language Models (SLMs) with $\leq$ 5B active parameters to generate high-quality, premise-conditioned plots competitive with frontier systems up to $200\times$ larger. Our approach decomposes generation into three specialized components: (1) an Aspect Rating Reward Model trained via a novel Positive-Negative prompting strategy to deliver structured narratives across five Narrative Quality Dimensions (NQDs); (2) a Mixture-of-Experts (MoE) plot generator aligned via Direct Preference Optimization on high-confidence preference pairs; and (3) an Agentic Evaluation module that emulates human critical judgment for unbiased post-hoc assessment. Extensive experiments demonstrate that PlotTwist consistently outperforms frontier models across multiple NQDs despite substantially tighter capacity constraints. Further validation confirms strong sensitivity to narrative quality, as the framework reliably distinguishes plots derived from critically acclaimed versus widely panned screenplays. Together, these results establish structured, preference-based alignment as a resource-efficient approach to high-quality creative plot generation.
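As a rough illustration of how components (1) and (2) might fit together, the sketch below builds preference pairs from per-dimension reward scores and evaluates the standard Direct Preference Optimization loss on one pair. It is not the authors' implementation. The function names, the use of a mean over the five NQD scores, the `min_margin` confidence threshold, and the `beta` value are all assumptions made for this example; the abstract does not specify how high-confidence pairs are selected.

```python
import math
from itertools import combinations


def mean_score(ratings):
    """Collapse per-NQD ratings into one scalar (assumed aggregation)."""
    return sum(ratings) / len(ratings)


def select_pairs(candidates, min_margin=1.0):
    """Keep only pairs whose aggregate scores differ by at least min_margin.

    candidates: list of (plot_text, ratings), where ratings holds one score
    per narrative quality dimension. min_margin is a hypothetical stand-in
    for a "high-confidence" criterion.
    Returns a list of (chosen, rejected) plot texts.
    """
    pairs = []
    for (plot_a, r_a), (plot_b, r_b) in combinations(candidates, 2):
        margin = mean_score(r_a) - mean_score(r_b)
        if abs(margin) >= min_margin:
            if margin > 0:
                pairs.append((plot_a, plot_b))
            else:
                pairs.append((plot_b, plot_a))
    return pairs


def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Standard DPO loss for one pair: -log(sigmoid(beta * log-ratio gap)).

    Inputs are sequence log-probabilities under the policy being trained
    and under a frozen reference model.
    """
    x = beta * ((logp_chosen - ref_logp_chosen)
                - (logp_rejected - ref_logp_rejected))
    # Numerically stable -log(sigmoid(x)), i.e. softplus(-x).
    if x >= 0:
        return math.log1p(math.exp(-x))
    return -x + math.log1p(math.exp(x))


# Toy usage with made-up ratings for three candidate plots of one premise.
candidates = [
    ("plot A", [5, 5, 4, 4, 5]),
    ("plot B", [2, 2, 3, 2, 2]),
    ("plot C", [4, 5, 4, 4, 5]),
]
pairs = select_pairs(candidates, min_margin=1.0)
```

In this toy example, plots A and C score too closely to form a pair, so only comparisons against plot B survive the filter. The loss falls below log 2 once the policy favors the chosen plot more strongly than the reference model does.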
