Clinical texts in electronic medical records (EMRs) contain rich medical information and are essential for disease prediction, personalised information recommendation, clinical decision support, and medication pattern mining and measurement. Extracting relations between medication mentions and temporal information can further help clinicians better understand a patient's treatment history. To evaluate the performance of deep learning (DL) and large language models (LLMs) on medication extraction and temporal relation classification, we carry out an empirical investigation in the \textbf{MedTem} project using several advanced learning structures, including BiLSTM-CRF and CNN-BiLSTM for clinical-domain named entity recognition (NER) and BERT-CNN for temporal relation extraction (RE), and we explore different word embedding techniques. We also design a set of post-processing rules to generate structured output on medications and their temporal relations. Our experiments show that CNN-BiLSTM slightly outperforms the BiLSTM-CRF model on the i2b2-2009 clinical NER task, yielding 75.67, 77.83, and 78.17 for precision, recall, and F1 using macro average. The BERT-CNN model also produced reasonable scores of 64.48, 67.17, and 65.03 for P/R/F1 using macro average on the temporal relation extraction test set from the i2b2-2012 challenge. Code and tools from MedTem will be hosted at \url{https://github.com/HECTA-UoM/MedTem}.
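Since the reported scores are macro averages, a minimal sketch of how macro-averaged precision, recall, and F1 can be computed from token-level labels may help when reading them. It assumes the common convention in which macro F1 is the unweighted mean of per-class F1 scores; the abstract does not state which convention MedTem uses. The function name `macro_prf` and the labels `DRUG`, `DOSE`, and `O` are hypothetical and chosen only for illustration.

```python
from collections import Counter


def macro_prf(gold, pred):
    """Macro-averaged (precision, recall, F1) over labels seen in gold or pred.

    Assumption: macro F1 is the unweighted mean of per-class F1 scores,
    which may differ from the convention used in the MedTem experiments.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")

    labels = sorted(set(gold) | set(pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted label p where it was not the gold label
            fn[g] += 1  # missed the gold label g

    precisions, recalls, f1s = [], [], []
    for label in labels:
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        prec = tp[label] / p_den if p_den else 0.0
        rec = tp[label] / r_den if r_den else 0.0
        f1 = 2 * prec * rec / (prec + rec) if (prec + rec) else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)

    n = len(labels)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Toy example with hypothetical labels (not data from the paper).
gold = ["DRUG", "DRUG", "DOSE", "O"]
pred = ["DRUG", "DOSE", "DOSE", "O"]
p, r, f = macro_prf(gold, pred)
print(f"P={p:.4f} R={r:.4f} F1={f:.4f}")
```

In this toy case macro precision and macro recall are both 5/6 while macro F1 is 7/9, which illustrates that under this convention macro F1 need not equal the harmonic mean of macro precision and macro recall.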