Recent progress in text-to-image models pretrained on large-scale datasets has made it possible to generate a wide variety of images given only a text prompt describing the desired content. Nevertheless, these models remain limited when the target images belong to a specific domain that is either hard to describe or unseen by the models. In this work, we propose DomainGallery, a few-shot domain-driven image generation method that finetunes pretrained Stable Diffusion on few-shot target datasets in an attribute-centric manner. Specifically, DomainGallery features prior attribute erasure, attribute disentanglement, regularization and enhancement. These techniques are tailored to few-shot domain-driven generation in order to solve key issues that previous works have failed to settle. Extensive experiments validate the superior performance of DomainGallery on a variety of domain-driven generation scenarios. Code is available at https://github.com/Ldhlwh/DomainGallery.