General image-to-video generation methods often produce animations that fall short of the requirements of motion graphics: text is not actively animated and objects become distorted. Code-based animation generation methods, in turn, typically require layer-structured vector data, which is often not readily available for motion graphics generation. To address these challenges, we propose MG-Gen, a framework that reconstructs vector-format data from a single raster image, extending code-based methods so that motion graphics can be generated from a raster image within the general image-to-video setting. MG-Gen first decomposes the input image into layer-wise elements, reconstructs them as HTML data, and then generates executable JavaScript code for the reconstructed HTML. We experimentally confirm that MG-Gen generates motion graphics while preserving text readability and consistency with the input image. These results indicate that combining layer decomposition with animation code generation is an effective strategy for motion graphics generation.
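To make the decompose, reconstruct, and animate pipeline more concrete, the following is a minimal sketch and not the MG-Gen implementation. It assumes layer decomposition has already produced a list of elements with bounding boxes. The `Layer` schema, the function names, and the fixed fade-in template are all hypothetical stand-ins for illustration; in particular, the template takes the place of the animation code that MG-Gen generates.

```python
from dataclasses import dataclass
from html import escape


@dataclass
class Layer:
    """One decomposed element (hypothetical schema, not the paper's format)."""
    element_id: str   # assumed to be a simple identifier such as "title"
    kind: str         # "text" or "image"
    content: str      # the text itself, or an image path for image layers
    x: int
    y: int
    width: int
    height: int


def layers_to_html(layers, canvas_width, canvas_height):
    """Rebuild the layers as absolutely positioned HTML elements."""
    parts = [
        '<div id="canvas" style="position:relative;'
        f'width:{canvas_width}px;height:{canvas_height}px;overflow:hidden">'
    ]
    for layer in layers:
        style = (
            f"position:absolute;left:{layer.x}px;top:{layer.y}px;"
            f"width:{layer.width}px;height:{layer.height}px"
        )
        if layer.kind == "text":
            # Text stays as live text, so it remains readable and animatable.
            parts.append(
                f'<div id="{escape(layer.element_id)}" style="{style}">'
                f"{escape(layer.content)}</div>"
            )
        else:
            parts.append(
                f'<img id="{escape(layer.element_id)}" '
                f'src="{escape(layer.content)}" style="{style}">'
            )
    parts.append("</div>")
    return "\n".join(parts)


def fade_in_script(layers, stagger_ms=200, duration_ms=600):
    """Emit JavaScript that fades each layer in, one after another.

    This is a fixed template (an assumption for illustration), using the
    browser's Element.animate() from the Web Animations API.
    """
    lines = []
    for index, layer in enumerate(layers):
        lines.append(
            f'document.getElementById("{layer.element_id}").animate('
            f"[{{opacity: 0}}, {{opacity: 1}}], "
            f'{{duration: {duration_ms}, delay: {index * stagger_ms}, fill: "both"}});'
        )
    return "\n".join(lines)


if __name__ == "__main__":
    demo_layers = [
        Layer("bg", "image", "background.png", 0, 0, 640, 360),
        Layer("title", "text", "Summer Sale", 40, 30, 400, 60),
    ]
    print(layers_to_html(demo_layers, 640, 360))
    print(fade_in_script(demo_layers))
```

Because each element keeps its own `id` and position, a script can move or fade individual layers without re-rendering pixels, which is the property the abstract associates with preserved text readability and input consistency.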