Information retrieval is foundational to the modern digital industry. Natural language search has progressed dramatically in recent years, driven largely by embedding-based models and large-scale pretraining, yet significant challenges remain. In particular, queries that involve complex relationships, object compositions, or precise constraints such as identities, counts, and proportions are often handled unreliably, or not at all, by current frameworks. In this paper, we propose a novel framework that integrates formal verification into deep learning-based image retrieval by combining graph-based verification methods with neural code generation. Our approach aims to support open-vocabulary natural language queries while producing results that are both trustworthy and verifiable. By grounding retrieval results in a system of formal reasoning, we move beyond the ambiguity and approximation that often characterize vector representations. Rather than accepting uncertainty as a given, our framework explicitly verifies each atomic claim in the user query against the retrieved content. This allows us not only to return matching results but also to identify and mark which specific constraints are satisfied and which remain unmet, offering a more transparent and accountable retrieval process while improving on the results of the most popular embedding-based approaches.
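To make the idea of per-constraint verification concrete, the following is a minimal sketch, not the framework's actual implementation. It assumes each image has already been converted into a small scene graph and that the query has already been decomposed into atomic constraints; in the proposed framework that decomposition would come from neural code generation, whereas here the constraints are written by hand. The names `SceneGraph`, `Constraint`, `count_equals`, `related`, and `verify` are hypothetical and chosen only for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass
class SceneGraph:
    """Hypothetical scene graph for one image."""
    labels: dict[int, str]             # object id -> class label
    edges: set[tuple[int, str, int]]   # (subject id, relation, object id)


@dataclass
class Constraint:
    """One atomic claim from the query, paired with a checkable predicate."""
    description: str
    check: Callable[[SceneGraph], bool]


def count_equals(label: str, n: int) -> Constraint:
    """Claim: the image contains exactly n objects with this label."""
    def check(g: SceneGraph) -> bool:
        return sum(1 for l in g.labels.values() if l == label) == n
    return Constraint(f"exactly {n} x {label}", check)


def related(subj: str, rel: str, obj: str) -> Constraint:
    """Claim: some object labelled subj stands in relation rel to some obj."""
    def check(g: SceneGraph) -> bool:
        return any(
            r == rel and g.labels[s] == subj and g.labels[o] == obj
            for s, r, o in g.edges
        )
    return Constraint(f"{subj} {rel} {obj}", check)


def verify(graph: SceneGraph, constraints: list[Constraint]) -> dict[str, bool]:
    """Check every atomic claim separately and report which ones hold."""
    return {c.description: c.check(graph) for c in constraints}


# Toy image: two dogs, one of them on a sofa.
graph = SceneGraph(
    labels={0: "dog", 1: "dog", 2: "sofa"},
    edges={(0, "on", 2)},
)

# Hand-written decomposition of a query such as "two dogs, one on a sofa".
query = [
    count_equals("dog", 2),
    related("dog", "on", "sofa"),
    related("sofa", "on", "dog"),  # deliberately false, to show an unmet claim
]

report = verify(graph, query)
for description, satisfied in report.items():
    print(("satisfied" if satisfied else "unmet"), "-", description)
```

Because each claim is checked independently, the report distinguishes satisfied constraints from unmet ones instead of collapsing the match into a single similarity score, which is the transparency property the abstract describes.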