Assigning diagnostic ICD codes to patient hospital admissions is typically performed by expert human coders. Efforts towards automated ICD coding are dominated by supervised deep learning models, but the difficulty of learning to predict the large number of rare codes remains a barrier to adoption in clinical practice. In this work, we leverage off-the-shelf pre-trained generative large language models (LLMs) to develop a practical solution suitable for zero-shot and few-shot code assignment. Because unsupervised pre-training alone does not guarantee precise knowledge of the ICD ontology or of the specialist clinical coding task, we frame the task as information extraction: we provide a description of each coded concept and ask the model to retrieve related mentions. For efficiency, rather than iterating over all codes, we exploit the hierarchical structure of the ICD ontology to search sparsely for relevant codes. Then, in a second stage, which we term 'meta-refinement', we use GPT-4 to select a subset of the relevant labels as predictions. We validate our method using Llama-2, GPT-3.5 and GPT-4 on the CodiEsp dataset of ICD-coded clinical case documents. Our tree-search method achieves state-of-the-art performance on rarer classes, with the best macro-F1 of 0.225 (PLM-ICD: 0.216), at the cost of a lower micro-F1 of 0.157 (PLM-ICD: 0.219). To the best of our knowledge, this is the first method for automated ICD coding requiring no task-specific learning.
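The sparse search over the ICD hierarchy described in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that are not from the paper: the `CodeNode` class, the `tree_search` function and the `keyword_match` relevance check are hypothetical names, the three-level toy hierarchy uses simplified descriptions rather than official ICD text, and a keyword test stands in for the LLM query that asks whether a concept is mentioned in the document. The 'meta-refinement' stage is not shown.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class CodeNode:
    """One node of a (toy) code hierarchy: chapter, category or leaf code."""
    code: str
    description: str
    children: List["CodeNode"] = field(default_factory=list)


def tree_search(document: str,
                roots: List[CodeNode],
                is_relevant: Callable[[str, str], bool]) -> List[str]:
    """Collect leaf codes, expanding only nodes judged relevant.

    A subtree is pruned as soon as its root is judged irrelevant, so its
    descendants are never queried.
    """
    candidates = []
    frontier = list(roots)
    while frontier:
        node = frontier.pop()
        if not is_relevant(document, node.description):
            continue  # prune this node and everything below it
        if node.children:
            frontier.extend(node.children)
        else:
            candidates.append(node.code)
    return sorted(candidates)


# Descriptions that were actually queried, recorded to show the sparsity.
queried: List[str] = []


def keyword_match(document: str, description: str) -> bool:
    """Stand-in for the LLM call (assumption): true if any word of the
    description longer than three characters occurs in the document."""
    queried.append(description)
    text = document.lower()
    return any(w in text for w in description.lower().split() if len(w) > 3)


# Toy hierarchy with simplified, illustrative descriptions.
roots = [
    CodeNode("J00-J99", "respiratory system diseases", [
        CodeNode("J18", "pneumonia", [
            CodeNode("J18.9", "pneumonia unspecified"),
        ]),
        CodeNode("J45", "asthma"),
    ]),
    CodeNode("I00-I99", "circulatory system diseases", [
        CodeNode("I10", "hypertension"),
    ]),
]

document = ("Patient admitted with respiratory distress; "
            "chest X-ray confirmed pneumonia.")
candidates = tree_search(document, roots, keyword_match)
print(candidates)
print(f"{len(queried)} relevance queries issued")
```

In this toy run the circulatory chapter is rejected at the top level, so its child code is never queried. On a full ontology the same pruning is what would keep the number of model calls far below the number of codes.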