GREC: Generalized Referring Expression Comprehension

The objective of classic Referring Expression Comprehension (REC) is to produce a bounding box for the object mentioned in a given textual description. Existing datasets and techniques in classic REC are generally tailored to single-target expressions, where each expression is linked to exactly one object. Expressions that refer to multiple targets, or to no target at all, have not been taken into account. This constraint hinders the practical applicability of REC. This study introduces a new benchmark termed Generalized Referring Expression Comprehension (GREC), which extends classic REC by permitting expressions to describe any number of target objects. To achieve this goal, we have built the first large-scale GREC dataset, named gRefCOCO. The dataset covers three kinds of expressions: those referring to multiple targets, those with no target, and single-target expressions. The design of GREC and gRefCOCO ensures smooth compatibility with classic REC. The proposed gRefCOCO dataset, a GREC method implementation, and GREC evaluation code are available at https://github.com/henghuiding/gRefCOCO.
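As a rough illustration of the difference between the two settings, the sketch below models a sample as one expression paired with a list of boxes that may be empty, hold one box, or hold several. The names `GrecSample`, `target_type`, and `is_classic_rec`, and the `(x, y, width, height)` box convention, are assumptions made for this example; they are not taken from the gRefCOCO release or its code.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Assumed box convention for illustration only: (x, y, width, height).
Box = Tuple[float, float, float, float]


@dataclass
class GrecSample:
    """Hypothetical container: one expression and any number of target boxes."""
    expression: str
    boxes: List[Box] = field(default_factory=list)


def target_type(sample: GrecSample) -> str:
    """Name which of the three expression kinds a sample belongs to."""
    n = len(sample.boxes)
    if n == 0:
        return "no-target"
    if n == 1:
        return "single-target"
    return "multi-target"


def is_classic_rec(sample: GrecSample) -> bool:
    """Classic REC is the special case with exactly one target box."""
    return len(sample.boxes) == 1


if __name__ == "__main__":
    samples = [
        GrecSample("the two people on the left", [(10, 20, 50, 120), (70, 25, 48, 118)]),
        GrecSample("the dog next to the bench", [(200, 150, 80, 60)]),
        GrecSample("the giraffe wearing a hat"),  # nothing in the image matches
    ]
    for s in samples:
        print(f"{s.expression!r}: {target_type(s)}")
```

Under this framing, a single-target sample is just a list of length one, which is one way to read the compatibility with classic REC that the abstract describes.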

