SVGBuilder: Component-Based Colored SVG Generation with Text-Guided Autoregressive Transformers

Scalable Vector Graphics (SVG) is an XML-based format for versatile graphics that offers resolution independence and scalability. Unlike raster images, SVGs are built from geometric shapes and support interactivity, animation, and manipulation via CSS and JavaScript. Current SVG generation methods suffer from high computational cost and complexity. Human designers, in contrast, create SVGs efficiently with component-based tools. Inspired by this, SVGBuilder introduces a component-based, autoregressive model that generates high-quality colored SVGs from textual input. It significantly reduces computational overhead compared to traditional methods, generating SVGs up to 604 times faster than optimization-based approaches. To address the limitations of existing SVG datasets and support our research, we introduce ColorSVG-100K, the first large-scale dataset of colored SVGs, comprising 100,000 graphics. This dataset fills the gap in color information for SVG generation models and increases the diversity of training data. Evaluation against state-of-the-art models demonstrates SVGBuilder's superior performance in practical applications, highlighting its efficiency and quality in generating complex SVG graphics.

