Personalization is critical for improving user experience in interactive writing and educational applications, yet it remains understudied in story generation. We study the task of personalized story generation, where the goal is to mimic an author's writing style given other stories they have written. We collect Mythos, a dataset of 3.6k stories from 112 authors, with an average of 16 stories per author, drawn from five distinct sources that reflect diverse story-writing settings. We propose a two-stage pipeline for personalized story generation: first, we infer an author's implicit writing characteristics and organize them into an Author Writing Sheet, which human evaluators judge to be of high quality; second, we simulate the author's persona using tailored persona descriptions and personalized story rules. We find that stories personalized using the Author Writing Sheet outperform a non-personalized baseline, achieving a 78% win rate in capturing the author's past style and a 59% win rate in similarity to ground-truth author stories. Human evaluation supports these findings and further highlights two trends: Reddit stories are easier to personalize than stories from the other sources, and the Creativity and Language Use aspects of stories are easier to personalize than the Plot.
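The win rates reported in the abstract come from pairwise comparisons between a personalized story and a non-personalized baseline story. The following is a minimal sketch of how such a win rate could be computed from a list of pairwise judgments. The label names (`"personalized"`, `"baseline"`, `"tie"`), the function name `win_rate`, and the choice to count a tie as half a win are illustrative assumptions; the abstract does not state how ties are handled or how judgments are encoded.

```python
from collections import Counter


def win_rate(judgments, system="personalized", count_ties_as_half=True):
    """Return the fraction of pairwise comparisons won by `system`.

    `judgments` is a list of labels, one per comparison. The label
    vocabulary and the tie policy are assumptions for illustration,
    not details taken from the paper.
    """
    if not judgments:
        raise ValueError("judgments must not be empty")
    counts = Counter(judgments)
    wins = counts[system]
    ties = counts["tie"]
    # Assumed tie policy: a tie contributes half a win when enabled.
    score = wins + (0.5 * ties if count_ties_as_half else 0.0)
    return score / len(judgments)


# Hypothetical judgments: 7 wins for the personalized story,
# 2 for the baseline, and 1 tie.
judgments = ["personalized"] * 7 + ["baseline"] * 2 + ["tie"]
rate = win_rate(judgments)
strict_rate = win_rate(judgments, count_ties_as_half=False)
```

With a win rate defined this way, 50% means the two systems are indistinguishable, so values such as 78% and 59% are read relative to that midpoint.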