Medical triage is the task of allocating medical resources and prioritizing patients based on medical need. This paper introduces the first large-scale public dataset for studying medical triage in the context of asynchronous outpatient portal messages. Our novel task formulation views patient message triage as a pairwise inference problem, where we train LLMs to choose "which message is more medically urgent" in a head-to-head, tournament-style re-sort of a physician's inbox. Our novel benchmark, PMR-Bench, contains 1,569 unique messages and 2,000+ high-quality test pairs for pairwise medical urgency assessment, alongside a scalable training data generation pipeline. PMR-Bench includes samples that contain both unstructured patient-written messages and real electronic health record (EHR) data, emulating a real-world medical triage scenario. We develop a novel automated data annotation strategy to provide LLMs with in-domain guidance on this task. The resulting data is used to train two model classes, UrgentReward and UrgentSFT, which leverage Bradley-Terry and next-token prediction objectives, respectively, to perform pairwise urgency classification. We find that UrgentSFT achieves top performance on PMR-Bench, with UrgentReward showing distinct advantages in low-resource settings. For example, UrgentSFT-8B and UrgentReward-8B provide a 15- and 16-point boost, respectively, on inbox sorting metrics over off-the-shelf 8B models. Paper resources can be found at https://tinyurl.com/Patient-Message-Triage
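
To make the pairwise formulation concrete, the following is a minimal sketch of a Bradley-Terry preference probability, the corresponding per-pair loss, and a head-to-head re-sort of an inbox. It is not the paper's implementation: the round-robin win-count ranking, the function names (`bt_prob`, `bt_loss`, `sort_inbox`), and the toy urgency scores are all assumptions made for illustration. In practice the scalar scores or the pairwise decision would presumably come from a trained model.

```python
import math
from itertools import combinations


def bt_prob(score_a: float, score_b: float) -> float:
    """Bradley-Terry probability that A outranks B, given scalar scores."""
    return 1.0 / (1.0 + math.exp(-(score_a - score_b)))


def bt_loss(score_more: float, score_less: float) -> float:
    """Negative log-likelihood of one labeled pair (more urgent vs. less urgent)."""
    return -math.log(bt_prob(score_more, score_less))


def sort_inbox(messages, prefer):
    """Rank messages by wins in a round-robin of pairwise comparisons.

    `prefer(a, b)` returns True when `a` is judged more urgent than `b`.
    Round-robin win counting is an assumed tournament scheme, chosen for
    simplicity; it uses O(n^2) comparisons.
    """
    wins = [0] * len(messages)
    for i, j in combinations(range(len(messages)), 2):
        if prefer(messages[i], messages[j]):
            wins[i] += 1
        else:
            wins[j] += 1
    order = sorted(range(len(messages)), key=lambda k: -wins[k])
    return [messages[k] for k in order]


# Hypothetical urgency scores standing in for a model's output.
toy_scores = {"refill request": 0.2, "chest pain": 3.0, "mild rash": 1.0}


def toy_prefer(a: str, b: str) -> bool:
    return bt_prob(toy_scores[a], toy_scores[b]) >= 0.5


ranked = sort_inbox(list(toy_scores), toy_prefer)
print(ranked)  # ['chest pain', 'mild rash', 'refill request']
```

A scalar-score model only needs one forward pass per message before sorting, whereas a generative pairwise judge needs one call per compared pair. That cost difference is a general property of the two designs, not a result reported in the abstract.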