While text-to-image (T2I) models have advanced considerably, their capability to associate colors with implicit concepts remains underexplored. To address this gap, we introduce ColorConceptBench, a new human-annotated benchmark for systematically evaluating color-concept associations through the lens of probabilistic color distributions. ColorConceptBench moves beyond explicit color names or codes by probing how models translate 1,281 implicit color concepts into colors, grounded in 6,369 human annotations. Our evaluation of seven leading T2I models reveals that current models lack sensitivity to abstract semantics and, crucially, that this limitation appears resistant to standard interventions (e.g., model scaling and guidance). This demonstrates that achieving human-like color semantics requires more than larger models; it demands a fundamental shift in how models learn and represent implicit meaning.