Storytelling is the lifeline of the entertainment industry: movies, TV shows, and stand-up comedy all need stories. A gripping script is the backbone of storytelling, and writing one demands creativity and a substantial investment of resources. Good scriptwriters are hard to find and often work under severe time pressure. Consequently, entertainment media companies are actively looking for automation. In this paper, we present KUROSAWA, an AI-based script-writing workbench that addresses two tasks: plot generation and script generation. Plot generation aims to produce a coherent and creative plot (600-800 words) from a prompt (15-40 words). Script generation produces a scene (200-500 words) in screenplay format from a brief description (15-40 words). Training KUROSAWA requires data. We create a dataset of 1000 plots paired with their prompts/storylines, manually annotated according to a 4-act structure of storytelling, and a gold-standard dataset of 1000 scenes in which the four main elements (scene headings, action lines, dialogues, and character names) are tagged individually. We fine-tune GPT-3 on these datasets to generate plots and scenes. The generated plots and scenes are first evaluated and then used by the scriptwriters of ErosNow, a large, well-known media platform. We release the annotated datasets and the models trained on them as a working benchmark for automatic movie plot and script generation.
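As an illustration of the scene dataset described above, the following is a minimal sketch of how one tagged scene might be serialized into a prompt/completion record for fine-tuning. The abstract does not specify the serialization, so the tag strings (`<SH>`, `<AL>`, `<CN>`, `<DG>`), the `SceneExample` class, and the helper names are all hypothetical; only the four element types and the word-count ranges come from the text.

```python
import json
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Hypothetical tag strings for the four scene elements named in the abstract.
# The actual tagging scheme is not given there; these are assumptions.
ELEMENT_TAGS: Dict[str, str] = {
    "scene_heading": "<SH>",
    "action_line": "<AL>",
    "character_name": "<CN>",
    "dialogue": "<DG>",
}


@dataclass
class SceneExample:
    """One training example: a brief description plus its tagged scene."""
    description: str                  # brief description (15-40 words in the abstract)
    elements: List[Tuple[str, str]]   # (element_type, text) pairs in screenplay order


def word_count(text: str) -> int:
    """Count whitespace-separated words."""
    return len(text.split())


def in_range(text: str, low: int, high: int) -> bool:
    """Check a text against a word-count range, e.g. 15-40 or 200-500."""
    return low <= word_count(text) <= high


def to_record(example: SceneExample) -> Dict[str, str]:
    """Serialize one example as a prompt/completion pair (assumed layout)."""
    lines = []
    for kind, text in example.elements:
        if kind not in ELEMENT_TAGS:
            raise ValueError(f"unknown element type: {kind}")
        lines.append(f"{ELEMENT_TAGS[kind]} {text}")
    return {"prompt": example.description, "completion": "\n".join(lines)}


def to_jsonl(examples: List[SceneExample]) -> str:
    """Emit one JSON object per line, a common layout for fine-tuning data."""
    return "\n".join(json.dumps(to_record(e)) for e in examples)
```

Under these assumptions, `in_range(example.description, 15, 40)` could be used to filter descriptions before writing the JSONL file, and the same check with `200, 500` applied to the untagged scene text.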