Clinical trials are critical for drug development, but patient recruitment is often expensive and inefficient. In recent years, machine learning models have been proposed to speed up recruitment by automatically matching patients to clinical trials, using longitudinal patient electronic health record (EHR) data and the trials' eligibility criteria. However, these models either depend on trial-specific expert rules that do not generalize to other trials, or perform matching at a very coarse level with a black-box model whose lack of interpretability makes its results difficult to adopt. To provide accurate and interpretable patient-trial matching, we introduce TREEMENT, a personalized dynamic tree-based memory network. It uses hierarchical clinical ontologies to expand the personalized patient representation learned from sequential EHR data, and then applies an attentional beam-search query learned from eligibility criteria embeddings to align patients and criteria at a granular level, improving both performance and interpretability. We evaluated TREEMENT against existing models on real-world datasets and show that it outperforms the best baseline by 7% in error reduction on criteria-level matching and achieves state-of-the-art results on trial-level matching. We also show that TREEMENT offers good interpretability, which makes its results easier to adopt.
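The idea of an attentional beam search over a hierarchical memory can be illustrated with a small sketch. This is not the TREEMENT implementation: the toy ontology `TREE`, the two-dimensional embeddings `EMB`, the dot-product scoring, and the function `beam_search` are all assumptions made for illustration. A criterion embedding acts as the query, attention is computed over the children of each node kept in the beam, and only the top `beam_width` nodes survive at each level, so the path that is followed can be read off directly.

```python
import math

# Hypothetical toy ontology (parent -> children); not taken from the paper.
TREE = {
    "root": ["cardio", "endocrine"],
    "cardio": ["hypertension", "heart_failure"],
    "endocrine": ["type2_diabetes", "hypothyroidism"],
}

# Hypothetical 2-d concept embeddings; a real model would learn these.
EMB = {
    "cardio": [1.0, 0.0],
    "endocrine": [0.0, 1.0],
    "hypertension": [1.0, 0.2],
    "heart_failure": [0.8, -0.3],
    "type2_diabetes": [0.1, 1.0],
    "hypothyroidism": [-0.2, 0.7],
}


def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def beam_search(query, beam_width=1):
    """Walk the tree top-down, keeping the beam_width best-attended nodes per level.

    Returns one list per tree level; each entry is (node, path_weight), where
    path_weight is the product of attention weights along the path from the root.
    """
    beam = [("root", 1.0)]
    levels = []
    while True:
        candidates = []
        for node, weight in beam:
            children = TREE.get(node, [])
            if not children:
                continue  # leaf node: nothing to expand
            attn = softmax([dot(query, EMB[c]) for c in children])
            candidates += [(c, weight * a) for c, a in zip(children, attn)]
        if not candidates:
            break
        beam = sorted(candidates, key=lambda t: t[1], reverse=True)[:beam_width]
        levels.append(beam)
    return levels


if __name__ == "__main__":
    # Assumed query vector standing in for an embedded eligibility criterion.
    for depth, nodes in enumerate(beam_search([0.0, 1.0], beam_width=1), start=1):
        print(depth, nodes)
```

Because each retained node carries the attention weight of its path, the sequence of selected concepts gives a simple, inspectable trace of which part of the hierarchy a criterion was aligned with. That is the kind of interpretability the abstract refers to, though the actual model may compute and use attention differently.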