Traditional change detection identifies where changes occur, but does not explain in natural language what changed. Existing remote sensing change captioning datasets typically describe overall image-level differences, leaving fine-grained, localized semantic reasoning largely unexplored. To close this gap, we present RSRCC, a new benchmark for remote sensing change question-answering containing 126k questions, split into 87k training, 17.1k validation, and 22k test instances. Unlike prior datasets, RSRCC is built around localized, change-specific questions, each of which requires reasoning about a particular semantic change. To the best of our knowledge, this is the first remote sensing change question-answering benchmark designed explicitly for such fine-grained, reasoning-based supervision. To construct RSRCC, we introduce a hierarchical semi-supervised curation pipeline that uses Best-of-N ranking as a critical final ambiguity-resolution stage. Candidate change regions are first extracted from semantic segmentation masks, then screened using an image-text embedding model, and finally validated through retrieval-augmented vision-language curation with Best-of-N ranking. This process enables scalable filtering of noisy and ambiguous candidates while preserving semantically meaningful changes. The dataset is available at https://huggingface.co/datasets/google/RSRCC.
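The three curation stages described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `min_pixels` and `threshold` parameters, and the toy scorers are all hypothetical, and the image-text embedding model and the vision-language judge are replaced by plain Python callables so that the control flow is visible on its own.

```python
from collections import deque
from typing import Callable, List, Sequence, Tuple

Mask = List[List[int]]          # per-pixel class ids
Region = List[Tuple[int, int]]  # (row, col) pixels of one candidate change


def candidate_regions(before: Mask, after: Mask, min_pixels: int = 2) -> List[Region]:
    """Stage 1 (assumed form): 4-connected components of pixels whose label changed."""
    h, w = len(before), len(before[0])
    seen = [[False] * w for _ in range(h)]
    regions: List[Region] = []
    for r in range(h):
        for c in range(w):
            if seen[r][c] or before[r][c] == after[r][c]:
                continue
            seen[r][c] = True
            queue, region = deque([(r, c)]), []
            while queue:  # breadth-first flood fill over changed pixels
                y, x = queue.popleft()
                region.append((y, x))
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    inside = 0 <= ny < h and 0 <= nx < w
                    if inside and not seen[ny][nx] and before[ny][nx] != after[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(region) >= min_pixels:  # drop tiny, likely noisy regions
                regions.append(region)
    return regions


def screen(regions: Sequence[Region],
           similarity: Callable[[Region], float],
           threshold: float = 0.5) -> List[Region]:
    """Stage 2 (assumed form): keep regions whose embedding-style score passes a threshold."""
    return [reg for reg in regions if similarity(reg) >= threshold]


def best_of_n(candidates: Sequence[str], judge: Callable[[str], float]) -> str:
    """Stage 3 (assumed form): score N candidate outputs and keep the highest-ranked one."""
    return max(candidates, key=judge)


# Toy inputs: class 0 turns into class 1 in a 2x2 block, plus one isolated pixel.
before = [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
after = [[0, 1, 1, 0], [0, 1, 1, 0], [0, 0, 0, 2]]

regions = candidate_regions(before, after)
# Hypothetical stand-in for an image-text similarity score.
kept = screen(regions, similarity=lambda reg: min(1.0, len(reg) / 4))
# Hypothetical stand-in for a vision-language judge; scores are made up.
toy_scores = {"a building was added": 0.9, "something changed": 0.2}
best = best_of_n(list(toy_scores), judge=toy_scores.get)
```

In this sketch each stage is cheaper than the next, so the expensive judge only runs on candidates that survive the earlier filters, which is one plausible reading of the "hierarchical" design.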