SatIR: Scalable High-Recall Constraint-Satisfaction-Based Information Retrieval for Clinical Trials Matching

Many important retrieval problems are not merely problems of semantic similarity but problems of constraint satisfaction: a retrieved item should be topically relevant to a query and also satisfy explicit requirements involving negation, temporal conditions, numeric thresholds, exceptions, ontological relations, and incomplete evidence. We study this challenge in clinical trial matching, a high-stakes testbed where a useful trial must both address a patient's medical needs and have eligibility criteria that the patient satisfies. We propose SatIR, a scalable constraint-based retrieval method for clinical trial matching. SatIR converts trial eligibility criteria and summaries into formal constraints, then retrieves patient--trial pairs by executing these constraints over a database. The system combines Satisfiability Modulo Theories (SMT), relational algebra, medical ontology grounding, and large language models (LLMs): formal methods provide executable and inspectable matching, while LLMs convert ambiguous, incomplete, and implicit clinical information into explicit, controllable constraint representations. Across the SIGIR 2016 patient--trial collection and TREC-2022-RetrievalSubset, a benchmark derived from TREC 2022, SatIR consistently improves eligibility-aware retrieval over similarity-based baselines. Relative to TrialGPT-style retrieval, SatIR retrieves 32%--72% more relevant-and-eligible trials per patient on SIGIR 2016 and achieves $1.8$--$3.2\times$ higher eligible-trial recall on TREC-2022-RetrievalSubset. Retrieval is fast, requiring only 146 milliseconds per patient over 3,621 SIGIR trials.
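
To make the idea of "executing constraints over a database" concrete, the following is a minimal illustrative sketch, not SatIR's implementation. It assumes a toy `patients` table and two hand-written trials whose inclusion and exclusion criteria are already expressed as SQL predicates; the schema, the trial names, and the helper `matching_trials` are all hypothetical. There is no SMT solver, ontology grounding, or LLM here. The sketch only shows how numeric thresholds, negated exclusions, and missing evidence (SQL `NULL`) could interact in a relational query.

```python
import sqlite3

# Hypothetical trials: criteria written by hand as SQL predicates.
# The strings are trusted constants, so interpolating them into SQL is
# acceptable for this sketch only.
TRIALS = {
    "trial_A": {"include": ["age >= 18", "hba1c >= 7.5"],
                "exclude": ["pregnant = 1"]},
    "trial_B": {"include": ["age < 18", "hba1c >= 8.0"],
                "exclude": []},
}


def build_db():
    """Create an in-memory toy patient table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE patients "
        "(id TEXT PRIMARY KEY, age INTEGER, hba1c REAL, pregnant INTEGER)"
    )
    conn.executemany(
        "INSERT INTO patients VALUES (?, ?, ?, ?)",
        [
            ("p1", 54, 8.1, 0),
            ("p2", 17, 9.0, 0),
            ("p3", 61, 7.9, None),  # pregnancy status unknown
            ("p4", 45, 6.2, 0),
        ],
    )
    return conn


def predicate(trial):
    """Combine criteria: all inclusions hold and no exclusion holds."""
    parts = [f"({c})" for c in trial["include"]]
    parts += [f"NOT ({c})" for c in trial["exclude"]]
    return " AND ".join(parts) if parts else "1"


def matching_trials(conn, patient_id, strict=False):
    """Return the trials whose constraints the patient satisfies.

    SQL uses three-valued logic, so a predicate over missing data
    evaluates to NULL. With strict=False, NULL is treated as
    "possibly eligible"; with strict=True it is treated as ineligible.
    """
    default = 0 if strict else 1
    matches = []
    for name, trial in sorted(TRIALS.items()):
        sql = (
            "SELECT 1 FROM patients "
            f"WHERE id = ? AND COALESCE(({predicate(trial)}), ?) = 1"
        )
        if conn.execute(sql, (patient_id, default)).fetchone():
            matches.append(name)
    return matches


if __name__ == "__main__":
    conn = build_db()
    for pid in ("p1", "p2", "p3", "p4"):
        print(pid, matching_trials(conn, pid), matching_trials(conn, pid, strict=True))
```

In this toy setup, whether patient `p3` (unknown pregnancy status) is returned for `trial_A` depends on the `strict` flag. This is one simple way to expose the treatment of incomplete evidence as an explicit, inspectable choice rather than leaving it implicit in a similarity score.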
