The performance of deep learning-based natural language processing systems depends on large amounts of labeled training data, which in the clinical domain are neither easily available nor affordable. Weak supervision and in-context learning offer partial solutions, particularly with large language models (LLMs), but their performance still trails that of traditional supervised methods trained on moderate amounts of gold-standard data. Moreover, inference with LLMs is computationally expensive. We propose an approach that combines LLM fine-tuning with weak supervision, requires virtually no domain knowledge, and still achieves consistently superior performance. Using a prompt-based approach, the LLM generates weakly labeled data for training a downstream BERT model, which is then further fine-tuned on small amounts of gold-standard data. We evaluate this approach with Llama2 on three n2c2 datasets. With no more than 10 gold-standard notes, our final BERT models, weakly supervised by fine-tuned Llama2-13B, consistently outperformed out-of-the-box PubMedBERT by 4.7% to 47.9% in F1 score. With only 50 gold-standard notes, our models approached the performance of fully fine-tuned systems.
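The weak-supervision step described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the prompt wording, the binary label set, and the `query_llm` stub (standing in for a fine-tuned Llama2 generation call) are all assumptions.

```python
# Sketch of prompt-based weak labeling for a downstream BERT classifier.
# All names below (LABELS, build_prompt, query_llm) are hypothetical.

LABELS = ["PRESENT", "ABSENT"]  # assumed binary label set for illustration


def build_prompt(note: str, concept: str) -> str:
    """Wrap a clinical note in a zero-shot classification prompt."""
    return (
        f"Does the following note indicate that the patient has {concept}? "
        f"Answer PRESENT or ABSENT.\nNote: {note}\nAnswer:"
    )


def parse_label(completion: str) -> str:
    """Map a free-text LLM completion onto the closed label set."""
    text = completion.strip().upper()
    for label in LABELS:
        if text.startswith(label):
            return label
    return "ABSENT"  # conservative fallback for unparseable output


def weak_label(notes, concept, query_llm):
    """Produce (note, weak_label) pairs for training the downstream model."""
    return [(n, parse_label(query_llm(build_prompt(n, concept)))) for n in notes]
```

The resulting weakly labeled pairs would then be fed to a standard sequence-classification fine-tuning loop for the BERT model (e.g., PubMedBERT), followed by further fine-tuning on the small gold-standard set.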