Question answering (QA) systems over patient data can assist both clinicians and patients: they can, for example, support clinical decision-making and help patients better understand their medical history. Large amounts of patient data are stored in electronic health records (EHRs), which makes EHR QA an important research area. In EHR QA, the answer is obtained from the patient's medical record. Because of differences in data format and modality, this task differs substantially from other medical QA tasks that retrieve answers from medical websites or scientific papers, and it therefore warrants dedicated study. This study provides a methodological review of existing work on QA over EHRs. We searched four digital sources (Google Scholar, ACL Anthology, ACM Digital Library, and PubMed) for articles on EHR QA published between January 1, 2005 and September 30, 2023. The search identified 4,111 papers; after screening against our inclusion criteria, 47 papers remained for further study. Of these 47 papers, 25 concerned EHR QA datasets and 37 concerned EHR QA models; the two groups overlap, since the counts sum to more than 47. We observed that QA on EHRs is a relatively new and underexplored area, and that most of the work is fairly recent. We also observed that emrQA is by far the most popular EHR QA dataset, both in citations and in usage by other papers. Finally, we identified the different models used in EHR QA, along with the evaluation metrics used to assess them.