Understanding human visual attention is key to preserving cultural heritage. We introduce SPGen, a novel deep learning model that predicts scanpaths (the sequences of eye movements) produced when viewers observe paintings. The architecture uses a Fully Convolutional Neural Network (FCNN) with differentiable fixation selection and learnable Gaussian priors to simulate natural viewing biases. To bridge the domain gap between photographs and artworks, we employ unsupervised domain adaptation via a gradient reversal layer, allowing the model to transfer knowledge from natural scenes to paintings. A random noise sampler further models the inherent stochasticity of eye-tracking data. Extensive testing shows that SPGen outperforms existing methods, offering a powerful tool for analyzing gaze behavior and advancing the preservation and appreciation of artistic treasures.
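The gradient reversal layer mentioned above is the standard mechanism for adversarial domain adaptation: it acts as the identity in the forward pass but negates (and scales) gradients in the backward pass, so the shared feature extractor learns to confuse a domain classifier that tries to tell photographs from paintings. The following is a minimal sketch of that behavior under assumed names (`GradientReversal`, `lam`); it illustrates the mechanism only and is not the authors' implementation.

```python
import numpy as np

class GradientReversal:
    """Sketch of a gradient reversal layer (GRL).

    Forward: identity. Backward: multiply the incoming gradient by
    -lam, so features feeding a domain classifier are pushed toward
    domain-invariance. `lam` is an assumed hyperparameter name.
    """

    def __init__(self, lam=1.0):
        self.lam = lam  # reversal strength, typically annealed during training

    def forward(self, x):
        # Identity: domain-classifier logits see the features unchanged.
        return x

    def backward(self, grad_output):
        # Sign flip: the feature extractor receives the *negated* gradient,
        # turning the domain classifier's loss into an adversarial signal.
        return -self.lam * grad_output

grl = GradientReversal(lam=0.5)
x = np.array([1.0, -2.0, 3.0])
y = grl.forward(x)               # identical to x
g = grl.backward(np.ones(3))     # reversed, scaled gradient
```

In a framework with automatic differentiation this is usually registered as a custom autograd operation rather than an explicit class, but the forward/backward contract is the same.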