Synthesis of digital artifacts conditioned on user prompts has become an important paradigm, driving an explosion of use cases for generative AI. However, such models often fail to connect the generated outputs with the desired target concepts/preferences implied by the prompts. Current research addressing this limitation has largely focused on enhancing the prompts before output generation or on improving the model's performance up front. In contrast, this paper conceptualizes prompt evolution, which imparts evolutionary selection pressure and variation during the generative process to produce multiple outputs that better satisfy the target concepts/preferences. We propose a multi-objective instantiation of this broader idea, guided by multi-label image classifiers. The labels predicted by the classifiers serve as multiple objectives to optimize, with the aim of producing diversified images that meet user preferences. A novelty of our evolutionary algorithm is that the pre-trained generative model provides implicit mutation operations: the model's stochastic generative capability is leveraged to automate the creation of Pareto-optimized images that are more faithful to user preferences.
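To make the selection step concrete, the following is a minimal sketch of Pareto-based survivor selection inside an evolutionary loop. It is not the paper's implementation. The names `dominates`, `pareto_front`, `evolve`, `toy_mutate`, and `toy_score` are hypothetical, and the toy stand-ins replace the generative model (implicit mutation) and the multi-label classifier (objective scores), which the abstract does not specify. The size cap by simple truncation is also an assumption; established algorithms such as NSGA-II use crowding distance for that step.

```python
import random
from typing import Callable, List, Sequence, Tuple

Scores = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b on every objective and strictly
    better on at least one (all objectives are maximized)."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))


def pareto_front(scores: List[Scores]) -> List[int]:
    """Indices of the score vectors that no other vector dominates."""
    return [i for i, s in enumerate(scores)
            if not any(dominates(t, s)
                       for j, t in enumerate(scores) if j != i)]


def evolve(population: list,
           mutate: Callable[[object, random.Random], object],
           score: Callable[[object], Scores],
           generations: int,
           max_size: int,
           rng: random.Random) -> list:
    """Hypothetical loop: vary, score, keep the non-dominated candidates."""
    for _ in range(generations):
        # Variation: in the paper's setting this role is played by the
        # generative model's own stochasticity (assumed stand-in here).
        offspring = [mutate(p, rng) for p in population]
        candidates = population + offspring
        # Objectives: in the paper's setting, classifier label predictions.
        scores = [score(c) for c in candidates]
        keep = pareto_front(scores)
        # Assumption: cap the population by plain truncation.
        population = [candidates[i] for i in keep][:max_size]
    return population


def toy_mutate(p: Tuple[float, float], rng: random.Random) -> Tuple[float, float]:
    """Toy stand-in for generative variation: Gaussian jitter clipped to [0, 1]."""
    return tuple(min(1.0, max(0.0, v + rng.gauss(0.0, 0.1))) for v in p)


def toy_score(p: Tuple[float, float]) -> Scores:
    """Toy stand-in for two classifier label confidences."""
    return (p[0], p[1])


if __name__ == "__main__":
    rng = random.Random(0)
    start = [(rng.random(), rng.random()) for _ in range(8)]
    final = evolve(start, toy_mutate, toy_score,
                   generations=20, max_size=16, rng=rng)
    print(len(final), "non-dominated candidates kept")
```

Because the parents compete with their offspring in each generation, a candidate is only discarded when another candidate dominates it or when the assumed size cap truncates the front. The survivors therefore represent different trade-offs between the objectives rather than a single best candidate.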