AniSora: Exploring the Frontiers of Animation Video Generation in the Sora Era

Yudong Jiang, Baohan Xu, Siqian Yang, Mingyu Yin, Jing Liu, Chao Xu, Siqi Wang, Yidi Wu, Bingwen Zhu, Xinwen Zhang, Xingyu Zheng, Jixuan Xu, Yue Zhang, Jinlong Hou, Huyang Sun

Animation has attracted significant interest in the film and TV industry in recent years. Although advanced video generation models such as Sora, Kling, and CogVideoX succeed at generating natural videos, they are less effective on animation videos. Evaluating animation video generation is also challenging because of animation's unique artistic styles, violations of the laws of physics, and exaggerated motions. In this paper, we present AniSora, a comprehensive system for animation video generation that includes a data processing pipeline, a controllable generation model, and an evaluation dataset. Supported by a data processing pipeline that yields over 10M high-quality data samples, the generation model incorporates a spatiotemporal mask module to enable key animation production functions such as image-to-video generation, frame interpolation, and localized image-guided animation. We also collect an evaluation benchmark of 948 diverse animation videos. Evaluation on VBench and in a double-blind human test demonstrates consistency in character and motion, achieving state-of-the-art results in animation video generation. Our evaluation benchmark will be publicly available at https://github.com/bilibili/Index-anisora.
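The abstract names a spatiotemporal mask module but does not describe how the mask is constructed. As a rough illustration only, the three functions it lists could each correspond to a different binary mask over (frame, row, column) that marks where guidance pixels are supplied. Everything in the sketch below (the `build_mask` helper, the mask layout, the choice of guided frames) is an assumption made for illustration and is not taken from the AniSora implementation.

```python
from typing import Iterable, List, Optional, Tuple

# Hypothetical layout: mask[frame][row][col], 1 = guidance given, 0 = to be generated.
Mask = List[List[List[int]]]
Region = Tuple[int, int, int, int]  # (top, left, bottom, right), bottom/right exclusive


def build_mask(num_frames: int, height: int, width: int,
               guided_frames: Iterable[int],
               region: Optional[Region] = None) -> Mask:
    """Build a binary spatiotemporal mask (illustrative assumption, not the paper's code)."""
    top, left, bottom, right = region if region is not None else (0, 0, height, width)
    if not (0 <= top <= bottom <= height and 0 <= left <= right <= width):
        raise ValueError("region lies outside the frame")
    guided = set(guided_frames)
    if any(f < 0 or f >= num_frames for f in guided):
        raise ValueError("guided frame index out of range")

    mask = [[[0] * width for _ in range(height)] for _ in range(num_frames)]
    for f in guided:
        for r in range(top, bottom):
            for c in range(left, right):
                mask[f][r][c] = 1
    return mask


def image_to_video_mask(num_frames: int, height: int, width: int) -> Mask:
    # Assumed setup: only the first frame is given in full.
    return build_mask(num_frames, height, width, [0])


def interpolation_mask(num_frames: int, height: int, width: int) -> Mask:
    # Assumed setup: the first and last frames are given, the rest are generated.
    return build_mask(num_frames, height, width, [0, num_frames - 1])


def localized_mask(num_frames: int, height: int, width: int, region: Region) -> Mask:
    # Assumed setup: only a rectangular region of the first frame is given.
    return build_mask(num_frames, height, width, [0], region)


def guided_pixel_count(mask: Mask) -> int:
    """Count the pixels marked as guidance across all frames."""
    return sum(value for frame in mask for row in frame for value in row)
```

Under these assumptions, a single mask format covers all three tasks, and only the set of guided frames and the spatial region change between them.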

