Visual illusions play a significant role in understanding visual perception. Current methods for understanding and evaluating visual illusions are mostly deterministic filtering-based approaches that are evaluated on only a handful of illusions, so their conclusions do not generalize. To this end, we generate a large-scale dataset of 22,366 images (BRI3L: BRightness Illusion Image dataset for Identification and Localization of illusory perception) covering five types of brightness illusions and benchmark the dataset using data-driven, neural-network-based approaches. The dataset provides two kinds of label information: (1) whether a particular image is illusory or non-illusory, and (2) the segmentation mask of the illusory region of the image. Hence, both classification and segmentation tasks can be evaluated using this dataset. We follow standard psychophysical experiments involving human subjects to validate the dataset. To the best of our knowledge, this is the first attempt to develop a dataset of visual illusions and benchmark it using a data-driven approach for illusion classification and localization. We consider five well-studied types of brightness illusions: 1) Hermann grid, 2) Simultaneous Brightness Contrast, 3) White illusion, 4) Grid illusion, and 5) Induced Grating illusion. Benchmarking on the dataset achieves 99.56% accuracy in illusion identification and 84.37% pixel accuracy in illusion localization. We also show that the deep learning model generalizes to unseen brightness illusions such as brightness assimilation to contrast transitions. In addition, we test the ability of state-of-the-art diffusion models to generate brightness illusions. All code, the dataset, and instructions are available in the GitHub repository: https://github.com/aniket004/BRI3L