Despite tremendous improvements in tasks such as image classification, object detection, and segmentation, the recognition of visual relationships, commonly modeled as the extraction of a graph from an image, remains a challenging task. We believe that this mainly stems from the lack of a canonical way to approach the visual graph recognition task. Most existing solutions are problem-specific and cannot be transferred between contexts out of the box, even though the conceptual problem remains the same. With broad applicability and simplicity in mind, we develop \textbf{Gra}ph Recognition via \textbf{S}ubgraph \textbf{P}rediction (\textbf{GraSP}), a method for recognizing graphs in images. We show across several synthetic benchmarks and one real-world application that our method works with diverse types of graphs and their drawings and can be transferred between tasks without task-specific modifications, paving the way to a more unified framework for visual graph recognition.