Composed Image Retrieval (CIR) enables flexible multimodal queries that combine a reference image with modification text. However, CIR inherently prioritizes semantic matching and therefore struggles to reliably retrieve a user-specified instance across contexts. In practice, preserving the specific instance often matters more than matching broad semantics. In this work, we propose Object-Anchored Composed Image Retrieval (OACIR), a novel fine-grained retrieval task that mandates strict instance-level consistency. To advance research on this task, we construct OACIRR (OACIR on Real-world images), the first large-scale, multi-domain benchmark, comprising over 160K quadruples and four challenging candidate galleries enriched with hard-negative instance distractors. Each quadruple augments the compositional query with a bounding box that visually anchors the object in the reference image, providing a precise and flexible way to specify which instance must be preserved. To address the OACIR task, we propose AdaFocal, a framework featuring a Context-Aware Attention Modulator that adaptively intensifies attention within the specified instance region, dynamically balancing focus between the anchored instance and the broader compositional context. Extensive experiments demonstrate that AdaFocal substantially outperforms existing compositional retrieval models, particularly in maintaining instance-level fidelity. It thereby establishes a robust baseline for this challenging task and opens new directions for more flexible, instance-aware retrieval systems.
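To make the query format concrete, the following is a minimal illustrative sketch of an object-anchored query and of one simple way a bounding box could bias attention toward the anchored region. It is not the AdaFocal implementation: the class `OACIRQuery`, its field names, the helpers `box_to_patch_mask` and `modulated_attention`, and the fixed scalar `gain` are all hypothetical. In particular, the abstract describes the modulation as adaptive, whereas this sketch uses a constant additive bias purely for illustration.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple

# (x_min, y_min, x_max, y_max) in pixels; this layout is an assumption.
Box = Tuple[float, float, float, float]


@dataclass
class OACIRQuery:
    """Hypothetical container for one quadruple described in the abstract."""
    reference_image: str    # identifier or path of the reference image
    box: Box                # bounding box anchoring the instance
    modification_text: str  # text describing the desired change
    target_image: str       # identifier or path of the ground-truth target


def box_to_patch_mask(box: Box, image_w: float, image_h: float,
                      grid: int) -> List[bool]:
    """Mark which patches of a grid x grid layout overlap the box (row-major)."""
    x0, y0, x1, y1 = box
    patch_w, patch_h = image_w / grid, image_h / grid
    mask = []
    for row in range(grid):
        for col in range(grid):
            px0, py0 = col * patch_w, row * patch_h
            overlaps = (px0 < x1 and px0 + patch_w > x0
                        and py0 < y1 and py0 + patch_h > y0)
            mask.append(overlaps)
    return mask


def modulated_attention(logits: List[float], mask: List[bool],
                        gain: float) -> List[float]:
    """Softmax over logits after adding `gain` to positions inside the box.

    A fixed gain is an assumption made for this sketch; an adaptive scheme
    would predict it from the query instead.
    """
    biased = [s + gain if inside else s for s, inside in zip(logits, mask)]
    peak = max(biased)  # subtract the max for numerical stability
    exps = [math.exp(b - peak) for b in biased]
    total = sum(exps)
    return [e / total for e in exps]


query = OACIRQuery(
    reference_image="ref.jpg",
    box=(0.0, 0.0, 40.0, 40.0),
    modification_text="the same object, placed outdoors",
    target_image="target.jpg",
)
mask = box_to_patch_mask(query.box, image_w=100, image_h=100, grid=2)
weights = modulated_attention([0.0, 0.0, 0.0, 0.0], mask, gain=math.log(3))
```

With a 2 x 2 patch grid and a box covering only the top-left patch, the additive bias shifts attention mass toward that patch while the remaining patches still receive nonzero weight, which loosely mirrors the balance between the anchored instance and the broader context that the abstract describes.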