Assessing the artness of AI-generated images remains a challenge in image generation. Most existing metrics cannot perform instance-level, reference-free artness evaluation. This paper presents ArtScore, a metric that evaluates the degree to which an image resembles authentic artworks by artists (or, conversely, photographs), offering a novel approach to artness assessment. We first blend pre-trained models for photo and artwork generation, yielding a series of mixed models. We then use these mixed models to generate images exhibiting varying degrees of artness, together with pseudo-annotations. Each photorealistic image has a corresponding artistic counterpart and a series of interpolated images that range from realistic to artistic. This dataset is then used to train a neural network that learns to estimate quantized artness levels of arbitrary images. Extensive experiments show that the artness levels predicted by ArtScore align more closely with human artistic evaluation than existing evaluation metrics such as Gram loss and ArtFID.
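The model-blending step can be illustrated with a minimal sketch. The abstract does not specify how the photo and artwork generators are blended, so the code below assumes plain linear interpolation of matching parameters, and it assumes the pseudo-annotation of each mixed model is simply its integer level. The names `blend_weights` and `build_mixed_models` are hypothetical, and weights are modeled as dictionaries of float lists rather than real network tensors.

```python
from typing import Dict, List, Tuple

# Toy stand-in for a model's parameters: name -> flat list of floats.
Weights = Dict[str, List[float]]


def blend_weights(photo: Weights, art: Weights, alpha: float) -> Weights:
    """Linearly interpolate two parameter sets (an assumed blending rule).

    alpha = 0.0 returns the photo model, alpha = 1.0 the artwork model.
    """
    if photo.keys() != art.keys():
        raise ValueError("both models must share the same parameter names")
    blended: Weights = {}
    for name in photo:
        p, a = photo[name], art[name]
        if len(p) != len(a):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        blended[name] = [(1.0 - alpha) * x + alpha * y for x, y in zip(p, a)]
    return blended


def build_mixed_models(
    photo: Weights, art: Weights, num_levels: int
) -> List[Tuple[int, Weights]]:
    """Return (pseudo_label, weights) pairs from fully photo to fully art.

    The pseudo-label is the quantized level index, assumed here to serve
    as the artness annotation for images generated by that mixed model.
    """
    if num_levels < 2:
        raise ValueError("need at least two levels (photo and art)")
    models = []
    for level in range(num_levels):
        alpha = level / (num_levels - 1)
        models.append((level, blend_weights(photo, art, alpha)))
    return models


if __name__ == "__main__":
    photo_model = {"w": [0.0, 2.0]}
    art_model = {"w": [1.0, 4.0]}
    for label, weights in build_mixed_models(photo_model, art_model, 3):
        print(label, weights)
```

Under these assumptions, feeding the same latent input to every mixed model would give one image per level, and the level index would act as the training target for the artness estimator.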