This paper introduces a data-driven, non-parametric method for image quality and aesthetics assessment that surpasses existing approaches and requires no prompt engineering or fine-tuning. We eliminate the need for expressive textual embeddings by proposing efficient image anchors in the data. In extensive evaluations with seven state-of-the-art self-supervised models, our method demonstrates superior performance and robustness across various datasets and benchmarks. Notably, it achieves high agreement with human assessments even with limited data and is robust to the nature of the data and its pre-processing pipeline. Our contributions offer a streamlined solution for image assessment while providing insights into the perception of visual information.
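The abstract does not spell out how the image anchors are used, so the following is only an illustrative sketch of one way an anchor-based, non-parametric score could be computed. It assumes that embeddings have already been extracted by some self-supervised model and that two small sets of anchor embeddings (higher-rated and lower-rated images) are available. The function name `anchor_score`, the parameter `k`, and the top-k cosine-similarity rule are hypothetical choices made for illustration, not the paper's procedure.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale vectors along the last axis to unit length."""
    norm = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norm, eps)


def anchor_score(query, good_anchors, bad_anchors, k=5):
    """Hypothetical anchor-based score (an assumption, not the paper's rule).

    Returns the mean cosine similarity of the query embedding to its k most
    similar higher-rated anchors minus the same quantity for lower-rated
    anchors. Larger values suggest the query resembles higher-rated images.
    No parameters are trained; only stored anchor embeddings are compared.
    """
    q = l2_normalize(np.asarray(query, dtype=float))
    good = l2_normalize(np.asarray(good_anchors, dtype=float))
    bad = l2_normalize(np.asarray(bad_anchors, dtype=float))

    # Cosine similarity of the query to every anchor in each set.
    sim_good = good @ q
    sim_bad = bad @ q

    # Average over the k most similar anchors (all anchors if fewer than k).
    top_good = np.sort(sim_good)[-k:].mean()
    top_bad = np.sort(sim_bad)[-k:].mean()
    return float(top_good - top_bad)


# Toy 2-D "embeddings" standing in for real model features.
good = np.array([[1.0, 0.0], [0.9, 0.1]])
bad = np.array([[0.0, 1.0], [0.1, 0.9]])
score_a = anchor_score([1.0, 0.05], good, bad, k=2)  # close to the good anchors
score_b = anchor_score([0.05, 1.0], good, bad, k=2)  # close to the bad anchors
```

In this toy setup `score_a` is positive and `score_b` is negative. With real data, such scores would typically be compared against human ratings using a rank correlation.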