Image retrieval is a fundamental task in computer vision. Despite recent advances in this field, many techniques have been evaluated on only a limited number of domains with a small number of instance categories. Notably, most existing works consider only domains such as 3D landmarks, which makes it difficult to generalize their conclusions to other domains, e.g., logos and other 2D flat objects. To bridge this gap, we introduce a new dataset for benchmarking visual search methods on flat images with diverse patterns. Our flat object retrieval benchmark (FORB) supplements the commonly adopted 3D object domain and, more importantly, serves as a testbed for assessing image embedding quality on out-of-distribution domains. In this benchmark, we investigate the retrieval accuracy of representative methods in terms of candidate ranks as well as matching score margin, a perspective that many works largely ignore. Our experiments not only highlight the challenges and rich heterogeneity of FORB, but also reveal hidden properties of different retrieval strategies. The proposed benchmark is a growing project, and we expect to expand it in both the quantity and variety of objects. The dataset and supporting code are available at https://github.com/pxiangwu/FORB/.