The task of assigning diagnostic ICD codes to patient hospital admissions is typically performed by expert human coders. Efforts towards automated ICD coding are dominated by supervised deep learning models. However, difficulties in learning to predict the large number of rare codes remain a barrier to adoption in clinical practice. In this work, we leverage off-the-shelf pre-trained generative large language models (LLMs) to develop a practical solution that is suitable for zero-shot and few-shot code assignment. Because unsupervised pre-training alone does not guarantee precise knowledge of the ICD ontology or of the specialist clinical coding task, we frame the task as information extraction: we provide a description of each coded concept and ask the model to retrieve related mentions. For efficiency, rather than iterating over all codes, we exploit the hierarchical structure of the ICD ontology to search sparsely for relevant codes. In a second stage, which we term 'meta-refinement', we use GPT-4 to select a subset of the relevant labels as predictions. We validate our method using Llama-2, GPT-3.5 and GPT-4 on the CodiEsp dataset of ICD-coded clinical case documents. Our tree-search method achieves state-of-the-art performance on rarer classes, with the best macro-F1 of 0.225 (compared to 0.216 from PLM-ICD), but a lower micro-F1 of 0.157 (compared to 0.219 from PLM-ICD). To the best of our knowledge, this is the first method for automated ICD coding requiring no task-specific learning.
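The sparse hierarchical search described above might be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `CodeNode`, `tree_search`, `keyword_judge` and `KEYWORDS` are hypothetical names, the ontology is a tiny illustrative fragment, and the keyword matcher is only a stand-in for the LLM prompt that would be given the candidate descriptions and asked which ones are mentioned in the document.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class CodeNode:
    """One node of the code hierarchy (chapter, category, or leaf code)."""
    code: str
    description: str
    children: List["CodeNode"] = field(default_factory=list)


# A judge receives the document and sibling candidates, and returns the
# candidates it considers mentioned. In the described method this role is
# played by an LLM; here it is an assumed, pluggable callable.
Judge = Callable[[str, List[CodeNode]], List[CodeNode]]


def tree_search(note: str, root: CodeNode, judge: Judge) -> List[str]:
    """Collect leaf codes, expanding only branches the judge marks relevant."""
    candidates: List[str] = []
    frontier = [root]
    while frontier:
        node = frontier.pop()
        if not node.children:
            candidates.append(node.code)  # reached a leaf: keep as candidate
        else:
            # One judge call per expanded node; pruned subtrees are never visited.
            frontier.extend(judge(note, node.children))
    return candidates


# Toy stand-in for the LLM: hand-written keywords per code (assumption).
KEYWORDS: Dict[str, Tuple[str, ...]] = {
    "J00-J99": ("pneumonia", "asthma"),
    "J18": ("pneumonia",),
    "J18.1": ("lobar pneumonia",),
    "J18.9": ("pneumonia",),
    "J45": ("asthma",),
    "J45.909": ("asthma",),
    "I00-I99": ("hypertension",),
    "I10": ("hypertension",),
}


def keyword_judge(note: str, options: List[CodeNode]) -> List[CodeNode]:
    text = note.lower()
    return [o for o in options if any(k in text for k in KEYWORDS.get(o.code, ()))]


# Illustrative fragment of the hierarchy under a synthetic root node.
ROOT = CodeNode("ROOT", "All diagnoses", [
    CodeNode("J00-J99", "Diseases of the respiratory system", [
        CodeNode("J18", "Pneumonia, unspecified organism", [
            CodeNode("J18.1", "Lobar pneumonia, unspecified organism"),
            CodeNode("J18.9", "Pneumonia, unspecified organism"),
        ]),
        CodeNode("J45", "Asthma", [
            CodeNode("J45.909", "Unspecified asthma, uncomplicated"),
        ]),
    ]),
    CodeNode("I00-I99", "Diseases of the circulatory system", [
        CodeNode("I10", "Essential (primary) hypertension"),
    ]),
])

NOTE = "Patient admitted with lobar pneumonia and a productive cough."
print(sorted(tree_search(NOTE, ROOT, keyword_judge)))  # ['J18.1', 'J18.9']
```

In this toy run the judge is called three times (at the root, the respiratory chapter, and `J18`) and the circulatory and asthma subtrees are never expanded, which is the source of the efficiency gain over iterating through every code. The returned leaves are candidates only; selecting the final subset would be the job of the separate refinement stage, which is not shown here.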