A word-as-image is a semantic typography technique in which the illustration of a word visualizes the word's meaning while preserving its readability. We present a method to create word-as-image illustrations automatically. This task is highly challenging, as it requires a semantic understanding of the word and a creative idea of where and how to depict these semantics in a visually pleasing and legible manner. We rely on the remarkable ability of recent large pretrained language-vision models to distill textual concepts visually. We target simple, concise, black-and-white designs that convey the semantics clearly. We deliberately do not change the color or texture of the letters and do not use embellishments. Our method optimizes the outline of each letter to convey the desired concept, guided by a pretrained Stable Diffusion model. We incorporate additional loss terms to ensure the legibility of the text and the preservation of the style of the font. We show high-quality and engaging results on numerous examples and compare them with alternative techniques.
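The idea of optimizing a letter outline under a concept term plus a preservation term can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not the method described above: the Stable Diffusion guidance is replaced by a hypothetical stand-in (a quadratic pull toward a hand-picked target shape), the legibility and font-style losses are collapsed into a single quadratic penalty on distance from the original outline, and the names `optimize_outline`, `preserve_weight`, `lr`, and `steps` are invented here.

```python
# Toy sketch only: NOT the method from the abstract.
# Assumptions (all hypothetical):
#   - "concept" loss   = sum ||p - target||^2   (stand-in for diffusion guidance)
#   - "preserve" loss  = sum ||p - original||^2 (stand-in for legibility/style terms)
#   - outline = list of 2D control points, optimized by plain gradient descent


def optimize_outline(original, target, preserve_weight=1.0, lr=0.1, steps=200):
    """Move control points toward `target` while penalizing drift from `original`.

    Minimizes  sum ||p - target||^2 + preserve_weight * sum ||p - original||^2.
    The step size must satisfy lr * (1 + preserve_weight) < 1 to converge.
    """
    points = [list(p) for p in original]  # work on a mutable copy
    for _ in range(steps):
        for p, p0, t in zip(points, original, target):
            for k in range(2):  # x and y coordinates
                grad = 2.0 * (p[k] - t[k]) + 2.0 * preserve_weight * (p[k] - p0[k])
                p[k] -= lr * grad
    return points


# Two control points on a baseline, and a target that lifts them upward.
original_outline = [(0.0, 0.0), (1.0, 0.0)]
target_shape = [(0.0, 2.0), (1.0, 2.0)]

balanced = optimize_outline(original_outline, target_shape, preserve_weight=1.0)
unconstrained = optimize_outline(original_outline, target_shape, preserve_weight=0.0)
print("balanced:", balanced)
print("unconstrained:", unconstrained)
```

In this toy setting, raising `preserve_weight` keeps the points closer to the original outline, while setting it to zero lets them move all the way to the target. This mirrors, in a very simplified form, the trade-off between conveying the concept and keeping the letter legible.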