CohortGPT: An Enhanced GPT for Participant Recruitment in Clinical Study

Participant recruitment based on unstructured medical texts, such as clinical notes and radiology reports, is a challenging yet important task for establishing cohorts in clinical research. Recently, Large Language Models (LLMs) such as ChatGPT have achieved tremendous success in various downstream tasks thanks to their strong performance in language understanding, inference, and generation. It is therefore natural to test their feasibility for the cohort recruitment task, which involves classifying a given paragraph of medical text into one or more disease labels. However, in knowledge-intensive settings such as medical text classification, where LLMs are expected to understand the decisions made by human experts and accurately identify the implied disease labels, they show mediocre performance. A possible explanation is that, by using only the medical text, the LLMs neglect the rich context of additional information that language affords. To this end, we propose using a knowledge graph as auxiliary information to guide the LLMs in making predictions. Moreover, to help the LLMs adapt further to the problem setting, we apply a chain-of-thought (CoT) sample selection strategy enhanced by reinforcement learning, which selects a set of CoT samples for each individual medical report. Experimental results and various ablation studies show that our few-shot learning method achieves satisfactory performance compared with fine-tuning strategies and is particularly advantageous when the available data is limited. The code and sample dataset of the proposed CohortGPT model are available at: https://anonymous.4open.science/r/CohortGPT-4872/
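As a rough illustration of how the two ingredients described above could fit together in a prompt, the following minimal Python sketch renders knowledge-graph triples as text and prepends a few CoT samples chosen per report. It is not the CohortGPT implementation: the function names, the triple format, the prompt layout, and the example fields (`report`, `rationale`, `labels`) are all assumptions made for illustration. In particular, the selection step here uses simple word overlap as a stand-in for the reinforcement-learning-based selection described in the abstract.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical triple format: (head, relation, tail).
Triple = Tuple[str, str, str]


def tokens(text: str) -> set:
    """Lowercased alphabetic word set of a text."""
    return set(re.findall(r"[a-z]+", text.lower()))


def jaccard(a: str, b: str) -> float:
    """Word-overlap similarity between two texts, in [0, 1]."""
    sa, sb = tokens(a), tokens(b)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def select_examples(report: str, pool: List[Dict[str, str]], k: int = 2) -> List[Dict[str, str]]:
    """Pick the k pool samples most similar to the report.

    Stand-in only: the abstract describes a selection strategy enhanced by
    reinforcement learning, which this overlap heuristic does not reproduce.
    """
    ranked = sorted(pool, key=lambda ex: jaccard(report, ex["report"]), reverse=True)
    return ranked[:k]


def build_prompt(
    report: str,
    labels: List[str],
    triples: List[Triple],
    examples: List[Dict[str, str]],
) -> str:
    """Assemble a few-shot prompt: labels, knowledge, CoT samples, query."""
    knowledge = "\n".join(f"- {h} {r} {t}" for h, r, t in triples)
    shots = "\n\n".join(
        f"Report: {ex['report']}\nReasoning: {ex['rationale']}\nLabels: {ex['labels']}"
        for ex in examples
    )
    return (
        f"Candidate labels: {', '.join(labels)}\n\n"
        f"Knowledge:\n{knowledge}\n\n"
        f"{shots}\n\n"
        f"Report: {report}\nReasoning:"
    )
```

The resulting string would be sent to an LLM, whose completion is expected to contain a reasoning chain followed by the predicted labels. Keeping selection separate from prompt assembly means a learned selection policy could replace `select_examples` without changing the prompt format.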
