Electronic health records (EHRs), though a boon for healthcare practitioners, grow longer and more convoluted every day. Sifting through these lengthy EHRs is taxing and has become a cumbersome part of physician-patient interaction. Several approaches have been proposed to alleviate this prevalent issue, via either summarization or sectioning; however, only a few have proven truly helpful. With the rise of automated methods, machine learning (ML) has shown promise in identifying relevant sections in EHRs. However, most ML methods rely on labeled data, which is difficult to obtain in healthcare. Large language models (LLMs), on the other hand, have performed impressive feats in natural language processing (NLP), even in a zero-shot manner, i.e., without any labeled data. To that end, we propose using LLMs to identify relevant section headers. We find that GPT-4 can effectively solve the task in both zero-shot and few-shot settings, and that it segments dramatically better than state-of-the-art methods. Additionally, we annotate a much harder real-world dataset and find that GPT-4 struggles to perform well on it, pointing to the need for further research and harder benchmarks.