Collaborative stories, texts written jointly by multiple authors with different writing styles and intentions, pose unique challenges for NLP models. Understanding and generating such stories remains underexplored because open-domain corpora are lacking. To address this, we introduce STORYWARS, a new dataset of over 40,000 collaborative stories written by 9,400 different authors on an online platform. We design 12 task types on STORYWARS, 7 for understanding and 5 for generation, and derive from them 101 diverse story-related tasks that form a multi-task benchmark covering fully-supervised, few-shot, and zero-shot scenarios. Furthermore, we present INSTRUCTSTORY, our instruction-tuned model for story tasks, and show that instruction tuning not only achieves superior results in the zero-shot and few-shot scenarios but also obtains the best performance on the fully-supervised tasks, establishing strong multi-task benchmark performance on STORYWARS.