Seedance 2.0 is a new native multi-modal audio-video generation model, officially released in China in early February 2026. Compared with its predecessors, Seedance 1.0 and 1.5 Pro, it adopts a unified, efficient, large-scale architecture for joint multi-modal audio-video generation. The model accepts four input modalities (text, image, audio, and video) and integrates one of the most comprehensive suites of multi-modal content reference and editing capabilities available in the industry to date. It delivers substantial, well-rounded improvements across all key sub-dimensions of video and audio generation, and in both expert evaluations and public user tests it performs on par with the leading level in the field. Seedance 2.0 directly generates audio-video content 4 to 15 seconds long, at native output resolutions of 480p and 720p. For multi-modal reference inputs, the current open platform accepts up to 3 video clips, 9 images, and 3 audio clips. We also provide Seedance 2.0 Fast, an accelerated variant designed to boost generation speed in low-latency scenarios. Overall, Seedance 2.0 significantly improves both foundational generation capabilities and multi-modal generation performance, bringing an enhanced creative experience to end users.
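The limits stated above (4 to 15 seconds, 480p or 720p, and at most 3 video clips, 9 images, and 3 audio clips as references) can be checked on the client side before a request is submitted. The following is a minimal sketch under stated assumptions: `ReferenceRequest`, `validate`, and the field names are hypothetical illustrations, not the open platform's actual schema or API. Only the numeric limits come from the text.

```python
from dataclasses import dataclass, field

# Limits quoted from the abstract; every name below is hypothetical.
MIN_DURATION_S = 4
MAX_DURATION_S = 15
RESOLUTIONS = ("480p", "720p")
MAX_REFERENCES = {"video": 3, "image": 9, "audio": 3}


@dataclass
class ReferenceRequest:
    """Hypothetical request shape, not the real platform schema."""
    prompt: str                  # text input
    duration_s: float            # requested clip length in seconds
    resolution: str              # requested native output resolution
    references: dict = field(default_factory=dict)  # modality -> list of file paths


def validate(req):
    """Return a list of limit violations; an empty list means the request fits."""
    problems = []
    if not MIN_DURATION_S <= req.duration_s <= MAX_DURATION_S:
        problems.append(
            f"duration {req.duration_s}s is outside {MIN_DURATION_S}-{MAX_DURATION_S}s"
        )
    if req.resolution not in RESOLUTIONS:
        problems.append(f"resolution {req.resolution} is not one of {RESOLUTIONS}")
    for modality, items in req.references.items():
        limit = MAX_REFERENCES.get(modality)
        if limit is None:
            problems.append(f"unsupported reference modality: {modality}")
        elif len(items) > limit:
            problems.append(f"{len(items)} {modality} references exceed the limit of {limit}")
    return problems
```

Returning a list of violations rather than raising on the first one lets a caller report every problem with a request at once.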