We present VecFusion, a new neural architecture that can generate vector fonts with varying topological structures and precise control point positions. Our approach is a cascaded diffusion model that consists of a raster diffusion model followed by a vector diffusion model. The raster model generates low-resolution, rasterized fonts with auxiliary control point information, capturing the global style and shape of the font, while the vector model synthesizes vector fonts conditioned on the low-resolution raster fonts from the first stage. To synthesize long and complex curves, our vector diffusion model uses a transformer architecture and a novel vector representation that enables the modeling of diverse vector geometry and the precise prediction of control points. Our experiments show that, in contrast to previous generative models for vector graphics, our new cascaded vector diffusion model generates higher-quality vector fonts with complex structures and diverse styles.
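The two-stage cascade described above can be illustrated with a minimal sketch. It is not the VecFusion implementation: it assumes a standard DDPM-style ancestral sampler with a linear noise schedule (the abstract does not specify the sampler), and the names `ddpm_sample`, `raster_denoiser`, `vector_denoiser`, and `generate_glyph`, the conditioning input `cond`, and all tensor shapes are hypothetical. The two denoisers are placeholders that predict zero noise; in a real system they would be trained networks, with the vector stage using a transformer.

```python
import numpy as np


def ddpm_sample(denoise_fn, shape, cond, num_steps, rng):
    """Generic DDPM ancestral sampling (assumed sampler, linear beta schedule)."""
    betas = np.linspace(1e-4, 0.02, num_steps)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)

    x = rng.standard_normal(shape)  # start from pure Gaussian noise
    for t in reversed(range(num_steps)):
        eps_hat = denoise_fn(x, t, cond)  # predicted noise at step t
        mean = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps_hat) / np.sqrt(alphas[t])
        # No noise is added on the final step.
        noise = rng.standard_normal(shape) if t > 0 else np.zeros(shape)
        x = mean + np.sqrt(betas[t]) * noise
    return x


def raster_denoiser(x, t, cond):
    # Hypothetical placeholder for the raster-stage network.
    return np.zeros_like(x)


def vector_denoiser(x, t, raster):
    # Hypothetical placeholder for the vector-stage network,
    # which would attend to the low-resolution raster it is conditioned on.
    return np.zeros_like(x)


def generate_glyph(cond, rng, num_steps=50, raster_size=64, max_points=128, point_dim=4):
    """Run the cascade: raster stage first, then the vector stage conditioned on it."""
    # Stage 1: low-resolution raster; the 2 channels (glyph image plus an
    # auxiliary control point map) are an assumed layout.
    raster = ddpm_sample(raster_denoiser, (raster_size, raster_size, 2), cond, num_steps, rng)
    # Stage 2: a fixed-length sequence of control point vectors (assumed layout),
    # sampled conditioned on the stage-1 raster.
    control_points = ddpm_sample(vector_denoiser, (max_points, point_dim), raster, num_steps, rng)
    return raster, control_points
```

The point of the sketch is the data flow: the second sampler receives the output of the first as its conditioning signal, so global shape is decided at low resolution before control points are placed.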