Few-shot font generation simultaneously requires global structural completeness and fine-grained local style fidelity. Existing methods usually either rely on global content-style modeling, which is robust but imperfectly disentangled, or emphasize component/local modeling, which captures fine details but relies heavily on local priors and reference coverage. We argue that the key challenge is not merely to learn purer conditions, but to organize complementary yet biased global and local conditions through multi-level allocation during generation. To this end, we propose SmartFont, a diffusion-based few-shot font generation framework that combines global content-style generation with weakly supervised local corrective experts. The local branch performs semantic-spatial allocation by learning expert-wise local concepts and semantically meaningful spatial maps under weak component supervision, enabling fine-grained correction without requiring explicit component-conditioned inference. On top of this, a denoising-state condition allocation module adaptively weights global content, global style, and local corrective features across timesteps and injection blocks. Extensive experiments show that SmartFont achieves a better global-local balance and improves both glyph quality and local detail fidelity.
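To make the idea of denoising-state condition allocation more concrete, the following is a minimal sketch of how per-condition weights might depend on the timestep and the injection block. It is not SmartFont's implementation: the names `allocation_weights`, `mix_conditions`, and `GATE`, the linear gating form, the softmax normalization, and the hand-picked gate values are all assumptions made for illustration. It also assumes the common convention that a larger timestep index means a noisier state. In a trained model the gate would presumably be learned rather than fixed.

```python
import math

# The three condition streams named in the abstract.
CONDITIONS = ("content", "style", "local")


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def allocation_weights(t, block, num_steps, num_blocks, gate):
    """Return one weight per condition for a given timestep and injection block.

    `gate` maps each condition to (w_t, w_block, bias). These are hypothetical,
    hand-picked placeholders standing in for learned parameters.
    """
    t_norm = t / max(num_steps - 1, 1)        # 0 = clean end, 1 = noisy end (assumed)
    b_norm = block / max(num_blocks - 1, 1)   # 0 = first block, 1 = last block
    logits = [
        gate[c][0] * t_norm + gate[c][1] * b_norm + gate[c][2]
        for c in CONDITIONS
    ]
    return dict(zip(CONDITIONS, softmax(logits)))


def mix_conditions(features, weights):
    """Weighted sum of equally sized per-condition feature vectors."""
    dim = len(features[CONDITIONS[0]])
    return [
        sum(weights[c] * features[c][i] for c in CONDITIONS)
        for i in range(dim)
    ]


# Hypothetical gate: lean on global content at noisy steps and on the local
# corrective signal at cleaner steps. This schedule is an assumption, not a
# result reported in the source.
GATE = {
    "content": (2.0, 0.0, 0.0),
    "style": (0.0, 1.0, 0.0),
    "local": (-2.0, 1.0, 0.0),
}

# Toy 2-dimensional features, one vector per condition.
features = {"content": [1.0, 0.0], "style": [0.0, 1.0], "local": [1.0, 1.0]}

for t in (999, 0):
    w = allocation_weights(t, block=3, num_steps=1000, num_blocks=4, gate=GATE)
    print(t, {c: round(v, 3) for c, v in w.items()}, mix_conditions(features, w))
```

The softmax keeps the three weights positive and summing to one, so the mixed feature is a convex combination of the condition streams. That is a simplifying design choice for this sketch; independent gates per condition would be an equally plausible reading of "adaptively weights".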