The field of Computer Vision (CV) is increasingly shifting towards ``high-level'' visual sensemaking tasks, yet the exact nature of these tasks remains unclear and largely tacit. This survey addresses that ambiguity by systematically reviewing research on high-level visual understanding, with a particular focus on Abstract Concepts (ACs) in automatic image classification. It makes three main contributions. First, it makes the tacit understanding of high-level semantics in CV explicit through a multidisciplinary analysis and a categorization into distinct clusters, including commonsense, emotional, aesthetic, and inductive interpretative semantics. Second, it identifies and categorizes the CV tasks associated with high-level visual sensemaking, offering insight into the diverse research areas within this domain. Third, it examines how abstract concepts such as values and ideologies are handled in CV, revealing challenges and opportunities in AC-based image classification. Notably, our review of AC image classification tasks highlights persistent challenges, such as the limited efficacy of massive datasets, and underscores the importance of integrating supplementary information and mid-level features. We emphasize the growing relevance of hybrid AI systems in addressing the multifaceted nature of these tasks. Overall, this survey advances the understanding of high-level visual reasoning in CV and lays the groundwork for future research.