Medical Triage as Pairwise Ranking: A Benchmark for Urgency in Patient Portal Messages

Medical triage is the task of allocating medical resources and prioritizing patients based on medical need. This paper introduces the first large-scale public dataset for studying medical triage in the context of asynchronous outpatient portal messages. Our task formulation treats patient message triage as a pairwise inference problem: we train LLMs to choose "which message is more medically urgent" in a head-to-head, tournament-style re-sort of a physician's inbox. Our benchmark, PMR-Bench, contains 1,569 unique messages and 2,000+ high-quality test pairs for pairwise medical urgency assessment, alongside a scalable training data generation pipeline. PMR-Bench includes samples that pair unstructured patient-written messages with real electronic health record (EHR) data, emulating a real-world medical triage scenario. We develop an automated data annotation strategy to provide LLMs with in-domain guidance on this task. The resulting data is used to train two model classes, UrgentReward and UrgentSFT, which use a Bradley-Terry objective and a next-token prediction objective, respectively, to perform pairwise urgency classification. We find that UrgentSFT achieves top performance on PMR-Bench, while UrgentReward shows distinct advantages in low-resource settings. For example, UrgentSFT-8B and UrgentReward-8B provide a 15- and 16-point boost, respectively, on inbox sorting metrics over off-the-shelf 8B models. Paper resources can be found at https://tinyurl.com/Patient-Message-Triage
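
To make the pairwise formulation concrete, the following is a minimal Python sketch, not the authors' implementation. It shows the standard Bradley-Terry preference probability and loss over two scalar urgency scores, and a re-sort of an inbox driven only by a pairwise "which is more urgent" judgment. The function names, the `toy_scores` values, and `toy_judge` are hypothetical stand-ins for a trained model; none of them come from the paper.

```python
import math
from functools import cmp_to_key


def bradley_terry_prob(score_a: float, score_b: float) -> float:
    """Bradley-Terry probability that item a is preferred over item b,
    given scalar scores: sigmoid(score_a - score_b)."""
    return 1.0 / (1.0 + math.exp(-(score_a - score_b)))


def bradley_terry_loss(score_more_urgent: float, score_less_urgent: float) -> float:
    """Negative log-likelihood of the labeled preference for one pair."""
    return -math.log(bradley_terry_prob(score_more_urgent, score_less_urgent))


def sort_inbox(messages, more_urgent):
    """Re-sort messages, most urgent first, using only pairwise judgments.

    `more_urgent(a, b)` is a hypothetical callable returning True when
    message a is judged more medically urgent than message b.
    """
    def compare(a, b):
        if more_urgent(a, b):
            return -1  # a sorts before b
        if more_urgent(b, a):
            return 1   # b sorts before a
        return 0       # no preference either way

    return sorted(messages, key=cmp_to_key(compare))


# Hypothetical urgency scores standing in for a trained model's outputs.
toy_scores = {
    "refill request": 0.5,
    "chest pain": 3.0,
    "appointment question": -1.0,
}


def toy_judge(a: str, b: str) -> bool:
    """Toy pairwise judge: prefers a when the Bradley-Terry probability exceeds 0.5."""
    return bradley_terry_prob(toy_scores[a], toy_scores[b]) > 0.5


if __name__ == "__main__":
    print(sort_inbox(list(toy_scores), toy_judge))
```

A comparison sort is only one way to turn pairwise judgments into an inbox ordering. It assumes the judge is consistent (transitive), which a learned model may not be, so a real system might aggregate wins over many pairs instead.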
