Visual illusions in humans arise when an observer interprets out-of-distribution stimuli: if the observer is adapted to certain statistics, perception of outliers deviates from reality. Recent studies have shown that artificial neural networks (ANNs) can also be deceived by visual illusions. This finding raises fundamental questions about the nature of visual information. Why are two independent systems, human brains and ANNs, susceptible to the same illusions? Should any ANN be capable of perceiving visual illusions? Are these perceptions a feature or a flaw? In this work, we study how visual illusions are encoded in diffusion models. We show that these models exhibit human-like brightness and color shifts in their latent space, and we use this fact to demonstrate that diffusion models can predict visual illusions. We also show how to generate previously unseen visual illusions in realistic images using text-to-image diffusion models. We validate this ability through psychophysical experiments showing that our model-generated illusions also fool humans.