Image memorability prediction has attracted interest in various fields, and the prediction accuracy of convolutional neural network (CNN) models has been approaching the empirical upper bound estimated from human consistency. However, which feature representations embedded in CNN models are responsible for this high accuracy remains an open question. To address this problem, this study sought to identify memorability-related feature representations in CNN models using brain similarity. Specifically, memorability prediction accuracy and brain similarity (assessed by Brain-Score) were examined across 16,860 layers in 64 CNN models pretrained for object recognition. This comprehensive analysis showed a clear tendency: layers with high memorability prediction accuracy had higher brain similarity with the inferior temporal (IT) cortex, the highest stage in the ventral visual pathway. Furthermore, fine-tuning the 64 CNN models revealed that brain similarity with the IT cortex at the penultimate layer was positively correlated with memorability prediction accuracy. This analysis also showed that the best fine-tuned model achieved accuracy comparable to that of state-of-the-art CNN models developed specifically for memorability prediction. Overall, the results indicate that the success of CNN models in predicting memorability relies on acquiring feature representations similar to those of the IT cortex. This study advances our understanding of feature representations and their use in predicting image memorability.