Creative image generation has emerged as a compelling area of research, driven by the need to produce novel, high-quality images that expand the boundaries of imagination. In this work, we propose a framework for creative generation using diffusion models, in which creativity is associated with the inverse probability of an image's existence in the CLIP embedding space. Unlike prior approaches that rely on manual blending of concepts or exclusion of subcategories, our method calculates the probability distribution of generated images and drives it toward low-probability regions to produce rare, imaginative, and visually captivating outputs. We also introduce pullback mechanisms that achieve high creativity without sacrificing visual fidelity. Extensive experiments on text-to-image diffusion models demonstrate the effectiveness and efficiency of our creative generation framework and showcase its ability to produce unique, novel, and thought-provoking images. This work provides a new perspective on creativity in generative models and offers a principled method to foster innovation in visual content synthesis.
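To make the idea of moving toward low-probability regions with a pullback more concrete, here is a minimal toy sketch. It is not the method of this work: the abstract does not specify the density model or the form of the pullback, so the Gaussian density, the Mahalanobis-radius cap, and all names (`fit_gaussian`, `log_density`, `creative_step`, `step`, `max_radius`) are assumptions chosen for illustration. Plain NumPy vectors stand in for CLIP embeddings.

```python
import numpy as np


def fit_gaussian(embeddings, ridge=1e-3):
    """Fit a mean and a regularized precision matrix to reference embeddings.

    Assumption: a single Gaussian stands in for the embedding distribution.
    """
    mu = embeddings.mean(axis=0)
    centered = embeddings - mu
    cov = centered.T @ centered / len(embeddings)
    cov += ridge * np.eye(cov.shape[0])  # keep the covariance invertible
    return mu, np.linalg.inv(cov)


def log_density(z, mu, precision):
    """Gaussian log-density of z, up to an additive constant."""
    d = z - mu
    return -0.5 * d @ precision @ d


def creative_step(z, mu, precision, step=0.1, max_radius=4.0):
    """Take one step toward lower density, then apply a hypothetical pullback.

    The gradient of the log-density is -precision @ (z - mu), so moving
    along +precision @ (z - mu) lowers the probability of z.
    """
    d = z - mu
    z_new = z + step * (precision @ d)

    # Hypothetical pullback: cap the Mahalanobis distance from the mean so
    # the embedding does not drift arbitrarily far from the data.
    d_new = z_new - mu
    radius = np.sqrt(d_new @ precision @ d_new)
    if radius > max_radius:
        z_new = mu + d_new * (max_radius / radius)
    return z_new
```

Calling `creative_step` repeatedly lowers `log_density` until the embedding reaches the `max_radius` shell, where the pullback holds it. In a real pipeline the embedding would come from a CLIP encoder and the update would have to be propagated back into the diffusion sampling process, which this sketch does not attempt.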