With the rapid advancement of text-to-image generative models, AI-generated images (AGIs) have been widely applied in entertainment, education, social media, and other fields. However, because quality varies widely among AGIs, there is an urgent need for quality models that are consistent with human subjective ratings. To address this issue, we consider a wide range of popular AGI models, generate AGIs with different prompts and model parameters, and collect subjective scores on both perceptual quality and text-to-image alignment, thus building AGIQA-3K, the most comprehensive AGI subjective quality database to date. Furthermore, we conduct a benchmark experiment on this database to evaluate the consistency between current Image Quality Assessment (IQA) models and human perception, and we propose StairReward, which significantly improves the assessment of subjective text-to-image alignment. We believe that the fine-grained subjective scores in AGIQA-3K will inspire subsequent AGI quality models to fit human subjective perception mechanisms at both the perception and alignment levels, and will help optimize the generation results of future AGI models. The database is released at https://github.com/lcysyzxdxc/AGIQA-3k-Database.