With the rapid evolution of Text-to-Image (T2I) models in recent years, their unsatisfactory generation results have become a challenge. However, uniformly refining AI-Generated Images (AIGIs) of different qualities not only limits the optimization achievable for low-quality AIGIs but also causes negative optimization of high-quality AIGIs. To address this issue, a quality-aware refiner named Q-Refine is proposed. Based on the preferences of the Human Visual System (HVS), Q-Refine uses an Image Quality Assessment (IQA) metric to guide the refining process for the first time, and it modifies images of different qualities through three adaptive pipelines. Experiments show that, for mainstream T2I models, Q-Refine can effectively optimize AIGIs of different qualities. It can serve as a general refiner that optimizes AIGIs at both the fidelity and aesthetic quality levels, thus expanding the application of T2I generation models.
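The idea of quality-guided refinement through three adaptive pipelines can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the Q-Refine method itself: the names `route_by_quality` and `refine`, the tier labels, the score range of [0, 1], and the thresholds 0.3 and 0.7 are all hypothetical. The sketch only shows the general pattern of scoring an image with an IQA metric and dispatching it to one of three pipelines.

```python
from typing import Any, Callable, Mapping


def route_by_quality(score: float, low: float = 0.3, high: float = 0.7) -> str:
    """Map an IQA score in [0, 1] to a quality tier.

    The thresholds are hypothetical placeholders, not values from the paper.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must lie in [0, 1], got {score}")
    if score < low:
        return "low"
    if score < high:
        return "medium"
    return "high"


def refine(
    image: Any,
    iqa_metric: Callable[[Any], float],
    pipelines: Mapping[str, Callable[[Any], Any]],
) -> Any:
    """Score the image, then apply the pipeline registered for its tier."""
    tier = route_by_quality(iqa_metric(image))
    return pipelines[tier](image)


# Toy stand-ins for a real IQA metric and real refining pipelines (assumptions).
toy_pipelines = {
    "low": lambda img: f"regenerated({img})",
    "medium": lambda img: f"locally_repaired({img})",
    "high": lambda img: f"lightly_enhanced({img})",
}
result = refine("sample_image", lambda img: 0.5, toy_pipelines)
```

Keeping the routing separate from the pipelines means a high-scoring image can receive only a light touch, which matches the abstract's concern about negative optimization of high-quality AIGIs.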