Creative sketching is a universal form of visual expression, but synthesizing images from abstract sketches is very challenging. Traditionally, a deep learning model for sketch-to-image synthesis must cope with distorted input sketches that lack visual detail, and it requires collecting large-scale sketch-image datasets. We are the first to study this task using diffusion models. Our model matches sketches through cross-domain constraints and uses a classifier to guide image synthesis more accurately. Extensive experiments confirm that our method is not only faithful to the user's input sketch but also maintains the diversity and imagination of the synthesized images. Our model outperforms GAN-based methods in both generation quality and human evaluation, and it does not rely on massive sketch-image datasets. Additionally, we present applications of our method to image editing and interpolation.