Scalable High-Recall Constraint-Satisfaction-Based Information Retrieval for Clinical Trials Matching

Clinical trials are central to evidence-based medicine, yet many struggle to meet enrollment targets, even though ClinicalTrials.gov lists over half a million trials and attracts approximately two million users monthly. Existing retrieval techniques, largely based on keyword and embedding-similarity matching between patient profiles and eligibility criteria, often suffer from low recall, low precision, and limited interpretability owing to the complex constraints in eligibility criteria. We propose SatIR, a scalable clinical trial retrieval method based on constraint satisfaction that enables high-precision, interpretable matching of patients to relevant trials. Our approach uses formal methods, namely Satisfiability Modulo Theories (SMT) and relational algebra, to efficiently represent and match key constraints from clinical trials and patient records. Beyond leveraging established medical ontologies and conceptual models, we use Large Language Models (LLMs) to convert informal reasoning about ambiguity, implicit clinical assumptions, and incomplete patient records into explicit, precise, controllable, and interpretable formal constraints. Evaluated on 59 patients and 3,621 trials, SatIR outperforms TrialGPT on all three evaluated retrieval objectives. It retrieves 32%-72% more relevant-and-eligible trials per patient, improves recall over the union of useful trials by 22-38 percentage points, and serves more patients with at least one useful trial. Retrieval is fast, requiring 2.95 seconds per patient over 3,621 trials. These results show that SatIR is scalable, effective, and interpretable.
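
To make the idea of constraint-based matching concrete, the following is a minimal sketch in plain Python. It is not SatIR's implementation and does not use an SMT solver or relational algebra engine. It only illustrates treating eligibility criteria as explicit range constraints over patient fields and keeping trials in which no constraint is violated. The class `RangeConstraint`, the function `match`, the field names, the trial identifiers, and all thresholds are hypothetical. Treating a missing patient value as "unknown, keep the trial" is likewise an assumption made for this illustration, not a policy stated in the abstract.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class RangeConstraint:
    """Hypothetical constraint: lo <= patient[field] <= hi (None = unbounded)."""
    field: str
    lo: Optional[float] = None
    hi: Optional[float] = None

    def check(self, patient: Dict[str, float]) -> Optional[bool]:
        """Return True/False when the value is known, None when it is missing."""
        value = patient.get(self.field)
        if value is None:
            return None  # incomplete record: outcome is unknown
        if self.lo is not None and value < self.lo:
            return False
        if self.hi is not None and value > self.hi:
            return False
        return True


def match(patient: Dict[str, float],
          trials: Dict[str, List[RangeConstraint]]) -> Dict[str, List[str]]:
    """Keep trials with no violated constraint.

    Each kept trial maps to the fields whose value was missing, so the
    reason a trial was retained stays inspectable.
    """
    kept: Dict[str, List[str]] = {}
    for trial_id, constraints in trials.items():
        outcomes = [(c, c.check(patient)) for c in constraints]
        if any(result is False for _, result in outcomes):
            continue  # at least one criterion is definitely violated
        kept[trial_id] = [c.field for c, result in outcomes if result is None]
    return kept


# Made-up patient and trials, for illustration only.
patient = {"age": 62, "hba1c": 8.1}  # no "egfr" value recorded
trials = {
    "TRIAL-A": [RangeConstraint("age", 18, 75), RangeConstraint("hba1c", 7.0, 10.5)],
    "TRIAL-B": [RangeConstraint("age", 18, 55)],
    "TRIAL-C": [RangeConstraint("age", lo=40), RangeConstraint("egfr", lo=30)],
}

print(match(patient, trials))
```

In this toy setup, TRIAL-B is dropped because the age bound is violated, while TRIAL-C is kept and flagged with the missing `egfr` field. Excluding only on definite violations favors recall. A stricter policy could instead drop trials with unknown outcomes.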
