Despite advances in visual object recognition, state-of-the-art deep learning models struggle to recognize novel objects in a few-shot setting, where only a limited number of examples are provided. Unlike humans, who excel at such tasks, these models often fail to leverage known relationships between entities to draw conclusions about such objects. In this work, we show that incorporating a symbolic knowledge graph into a state-of-the-art recognition model enables a new approach for effective few-shot classification. In our proposed neuro-symbolic architecture and training methodology, the knowledge graph is augmented with additional relationships extracted from a small set of examples, improving the model's ability to recognize novel objects by considering the presence of interconnected entities. We show that, unlike existing few-shot classifiers, our model can thereby incorporate not only objects but also abstract concepts and affordances. The existence of the knowledge graph also makes this approach amenable to interpretation through analysis of the relationships contained within it. We empirically show that our approach outperforms current state-of-the-art few-shot multi-label classification methods on the COCO dataset, and we evaluate the addition of abstract concepts and affordances on the Visual Genome dataset.