Synthesizing digital artifacts conditioned on user prompts has become an important paradigm, enabling a surge of use cases for generative AI. However, such models often fail to align the generated outputs with the target concepts and preferences implied by the prompts. Current research addressing this limitation has largely focused on enhancing the prompts before output generation or on improving the model's performance up front. In contrast, this paper conceptualizes prompt evolution, which applies evolutionary selection pressure and variation during the generative process to produce multiple outputs that better satisfy the target concepts and preferences. We propose a multi-objective instantiation of this broader idea that is guided by multi-label image classification. The labels predicted by the classifiers serve as the multiple objectives to optimize, with the aim of producing diverse images that meet user preferences. A novelty of our evolutionary algorithm is that the pre-trained generative model supplies implicit mutation operations: the model's stochastic generative capability is leveraged to automate the creation of Pareto-optimized images that are more faithful to user preferences.
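The selection-and-variation loop described above can be illustrated with a minimal sketch. It is not the authors' algorithm: it uses plain elitist non-dominated selection, and every name in it (`evolve`, `pareto_front`, `dominates`, and the `toy_*` functions) is hypothetical. In particular, the "images" are toy pairs of numbers, `toy_classify` stands in for a multi-label classifier that returns one confidence per target label, and `toy_mutate` stands in for re-running part of a stochastic generative process from a parent, which is one possible reading of "implicit mutation".

```python
import random
from typing import Callable, List, Sequence, Tuple

Image = Tuple[float, ...]    # toy stand-in for a generated image (assumption)
Scores = Tuple[float, ...]   # one classifier confidence per target label


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b on every objective and strictly better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(scored: List[Tuple[Image, Scores]]) -> List[Tuple[Image, Scores]]:
    """Keep only the candidates whose scores no other candidate dominates."""
    return [(img, s) for img, s in scored
            if not any(dominates(t, s) for _, t in scored)]


def evolve(generate: Callable[[random.Random], Image],
           mutate: Callable[[Image, random.Random], Image],
           classify: Callable[[Image], Scores],
           pop_size: int = 8,
           generations: int = 5,
           seed: int = 0) -> List[Tuple[Image, Scores]]:
    """Elitist multi-objective loop: keep non-dominated outputs, refill by mutation."""
    rng = random.Random(seed)
    population = [generate(rng) for _ in range(pop_size)]
    for _ in range(generations):
        scored = [(img, classify(img)) for img in population]
        # Selection pressure: only non-dominated candidates survive as parents.
        parents = [img for img, _ in pareto_front(scored)]
        # Variation: refill the population with mutated copies of random parents.
        children = [mutate(rng.choice(parents), rng)
                    for _ in range(pop_size - len(parents))]
        population = parents + children
    return pareto_front([(img, classify(img)) for img in population])


# Toy stand-ins (assumptions, not a real generative model or classifier).
def toy_generate(rng: random.Random) -> Image:
    return (rng.random(), rng.random())


def toy_mutate(parent: Image, rng: random.Random) -> Image:
    # Small Gaussian perturbation, clipped to [0, 1].
    return tuple(min(1.0, max(0.0, v + rng.gauss(0.0, 0.1))) for v in parent)


def toy_classify(img: Image) -> Scores:
    # Pretend each coordinate is the classifier's confidence for one label.
    return img


front = evolve(toy_generate, toy_mutate, toy_classify)
for image, scores in front:
    print(scores)
```

With a real system, `generate` would sample the pre-trained model from the user prompt, `mutate` would exploit the model's own stochasticity, and `classify` would return the multi-label classifier's predictions. The returned front is a set of mutually non-dominated outputs rather than a single best image, which is what allows diverse results that trade off the target labels differently.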