In recent years, human attention modelling has proven particularly useful not only for understanding the cognitive processes underlying visual exploration, but also for supporting artificial intelligence models that aim to solve problems in various domains, including image and video processing, vision-and-language applications, and language modelling. This survey offers a reasoned overview of recent efforts to integrate human attention mechanisms into contemporary deep learning models and discusses future research directions and challenges. For a comprehensive overview of ongoing research, refer to our dedicated repository, available at https://github.com/aimagelab/awesome-human-visual-attention.