We present VecFusion, a new neural architecture that can generate vector fonts with varying topological structures and precise control point positions. Our approach is a cascaded diffusion model consisting of a raster diffusion model followed by a vector diffusion model. The raster model generates low-resolution, rasterized fonts with auxiliary control point information, capturing the global style and shape of the font, while the vector model synthesizes vector fonts conditioned on the low-resolution raster fonts from the first stage. To synthesize long and complex curves, our vector diffusion model uses a transformer architecture and a novel vector representation that enables the modeling of diverse vector geometry and the precise prediction of control points. Our experiments show that, in contrast to previous generative models for vector graphics, our new cascaded vector diffusion model generates higher-quality vector fonts with complex structures and diverse styles.
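The two-stage cascade described above can be illustrated with a minimal sketch. This is not the VecFusion implementation: the denoisers are zero-output placeholders standing in for trained networks, and the function names, tensor shapes, step count, and linear noise schedule are all assumptions made for illustration. The sampling loop is the standard DDPM ancestral sampler, used here only to show how the output of a raster stage could condition a vector stage.

```python
import numpy as np


def ddpm_sample(shape, predict_noise, cond=None, steps=50, seed=0):
    """Standard DDPM ancestral sampling with a linear beta schedule (assumed)."""
    rng = np.random.default_rng(seed)
    betas = np.linspace(1e-4, 0.02, steps)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)

    x = rng.standard_normal(shape)  # start from pure Gaussian noise
    for t in reversed(range(steps)):
        eps = predict_noise(x, t, cond)  # predicted noise at step t
        mean = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])
        # No noise is added on the final step (t == 0).
        noise = rng.standard_normal(shape) if t > 0 else np.zeros(shape)
        x = mean + np.sqrt(betas[t]) * noise
    return x


def raster_denoiser(x, t, cond):
    """Hypothetical placeholder for a trained raster noise predictor."""
    return np.zeros_like(x)


def vector_denoiser(x, t, cond):
    """Hypothetical placeholder for a trained vector noise predictor.

    A real model would use the raster conditioning; here we only check
    that it is supplied.
    """
    assert cond is not None, "the vector stage is conditioned on the raster stage"
    return np.zeros_like(x)


def generate_glyph(raster_size=16, max_points=32, point_dim=2):
    """Run the cascade: raster stage first, then the conditioned vector stage."""
    # Assumed layout: channel 0 = glyph image, channel 1 = control point field.
    raster = ddpm_sample((2, raster_size, raster_size), raster_denoiser)
    # Assumed layout: a fixed-length sequence of (x, y) control points.
    points = ddpm_sample((max_points, point_dim), vector_denoiser, cond=raster, seed=1)
    return raster, points


raster, points = generate_glyph()
print(raster.shape, points.shape)
```

Because the placeholder denoisers carry no learned knowledge, the outputs are noise. The sketch is meant to show only the data flow, in which the low-resolution raster is sampled first and then passed as conditioning to the vector sampler.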