RJUA-QA: A Comprehensive QA Dataset for Urology

Shiwei Lyu, Chenfei Chi, Hongbo Cai, Lei Shi, Xiaoyan Yang, Lei Liu, Xiang Chen, Deng Zhao, Zhiqiang Zhang, Xianguo Lyu, Ming Zhang, Fangzhou Li, Xiaowei Ma, Yue Shen, Jinjie Gu, Wei Xue, Yiran Huang

From arXiv; an initial version.

We introduce RJUA-QA, a novel medical dataset for question answering (QA) and reasoning with clinical evidence, which helps bridge the gap between general large language models (LLMs) and medical-specific LLM applications. RJUA-QA is derived from realistic clinical scenarios and aims to help LLMs generate reliable diagnoses and advice. The dataset contains 2,132 curated Question-Context-Answer triples, corresponding to about 25,000 diagnostic records and clinical cases. It covers 67 common urological disease categories, which together account for more than 97.6% of the population seeking medical services in urology. Each data instance in RJUA-QA comprises: (1) a question mirroring a real patient's inquiry about clinical symptoms and medical conditions, (2) a context containing comprehensive expert knowledge, serving as a reference for medical examination and diagnosis, (3) a doctor's response offering the diagnostic conclusion and suggested examination guidance, (4) a diagnosed clinical disease as the recommended diagnostic outcome, and (5) clinical advice providing recommendations for medical examination. RJUA-QA is the first medical QA dataset for clinical reasoning over patient inquiries, where expert-level knowledge and experience are required to reach diagnostic conclusions and give medical examination advice. We comprehensively evaluate both medical-specific and general LLMs on the RJUA-QA dataset. Our data are publicly available at https://github.com/alipay/RJU_Ant_QA.
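The five-part instance described above can be pictured as a simple record. The following is only a minimal sketch: the class name `QAInstance`, the function `parse_instance`, and the field names `question`, `context`, `answer`, `disease`, and `advice` are hypothetical labels chosen for illustration, and the released files may use a different schema. The values are placeholders, not data from the dataset.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class QAInstance:
    """One instance with the five components listed in the abstract.

    Field names are assumptions made for this sketch, not the dataset's schema.
    """
    question: str  # (1) patient-style inquiry about symptoms and conditions
    context: str   # (2) expert knowledge used as reference for diagnosis
    answer: str    # (3) doctor's response: conclusion plus examination guidance
    disease: str   # (4) diagnosed clinical disease
    advice: str    # (5) recommended medical examinations


def parse_instance(raw: dict) -> QAInstance:
    """Build a QAInstance from a dict, rejecting missing or empty fields."""
    names = [f.name for f in fields(QAInstance)]
    missing = [n for n in names if not str(raw.get(n, "")).strip()]
    if missing:
        raise ValueError(f"missing or empty fields: {missing}")
    return QAInstance(**{n: str(raw[n]) for n in names})


# Placeholder values only; no real clinical content is shown here.
example = parse_instance({
    "question": "<patient question>",
    "context": "<expert knowledge passage>",
    "answer": "<doctor response>",
    "disease": "<diagnosed disease>",
    "advice": "<examination advice>",
})
```

Validating that all five components are present before use keeps incomplete records out of any downstream evaluation, since a model's output would be compared against both the disease and the advice fields.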
