Trailers are short promotional videos that give audiences a glimpse of a movie. Creating a trailer typically involves selecting key scenes, dialogues and action sequences from the full movie and editing them together so that they convey its tone, theme and overall appeal. This often includes adding music, sound effects, visual effects and text overlays to heighten the trailer's impact. In this paper, we present a framework for automated trailer production that exploits a comprehensive multimodal strategy. A Large Language Model (LLM) is used at several stages of trailer creation. First, it selects the key visual sequences that are relevant to the movie's core narrative. Then, it extracts the most appealing quotes from the movie and aligns them with the trailer's narrative. Additionally, the LLM assists in creating background music and voiceovers that enrich the audience's engagement, helping to make the trailer not just a summary of the movie's content but a narrative experience in itself. Results show that our framework generates trailers that viewers find more visually appealing than those produced by previous state-of-the-art competitors.
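The LLM-driven stages described above (scene selection, quote extraction, and assistance with music and voiceover) can be pictured as a small planning routine. The following is only a minimal sketch under stated assumptions, not the framework's actual implementation: the names `TrailerPlan`, `plan_trailer` and `parse_indices`, the prompt wording, and the idea of passing the LLM as a plain text-in, text-out callable are all hypothetical. Audio synthesis and video editing are not shown.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Assumption: the LLM is modelled as a function from prompt text to reply text.
LLM = Callable[[str], str]


@dataclass
class TrailerPlan:
    scenes: List[int] = field(default_factory=list)   # indices of chosen scenes
    quotes: List[str] = field(default_factory=list)   # chosen dialogue lines
    music_prompt: str = ""                            # text describing the soundtrack
    voiceover: str = ""                               # narration text


def numbered(items: List[str]) -> str:
    """Render items as 'index: text' lines so the LLM can answer with indices."""
    return "\n".join(f"{i}: {text}" for i, text in enumerate(items))


def parse_indices(reply: str, upper: int) -> List[int]:
    """Keep distinct integer tokens below `upper`, in the order they appear."""
    picked: List[int] = []
    for token in reply.replace(",", " ").split():
        if token.isdigit() and int(token) < upper and int(token) not in picked:
            picked.append(int(token))
    return picked


def plan_trailer(llm: LLM, scene_descriptions: List[str], dialogue_lines: List[str],
                 max_scenes: int = 3, max_quotes: int = 2) -> TrailerPlan:
    # Stage 1: ask for scenes relevant to the core narrative.
    reply = llm(f"SCENES: pick up to {max_scenes} indices central to the plot.\n"
                + numbered(scene_descriptions))
    scenes = parse_indices(reply, len(scene_descriptions))[:max_scenes]

    # Stage 2: ask for the most appealing quotes.
    reply = llm(f"QUOTES: pick up to {max_quotes} indices of appealing lines.\n"
                + numbered(dialogue_lines))
    quote_ids = parse_indices(reply, len(dialogue_lines))[:max_quotes]
    quotes = [dialogue_lines[i] for i in quote_ids]

    # Stage 3: ask for text that downstream music and speech tools could consume.
    context = "\n".join(scene_descriptions[i] for i in scenes)
    music = llm("MUSIC: describe a soundtrack for these scenes.\n" + context).strip()
    voiceover = llm("VOICEOVER: write one narration line for these scenes.\n"
                    + context).strip()
    return TrailerPlan(scenes, quotes, music, voiceover)
```

In this sketch the LLM only returns indices and text, so any out-of-range index in a reply is dropped rather than trusted, and the actual music generation, speech synthesis and editing would be handled by separate components.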