CasiMedicos-Arg: A Medical Question Answering Dataset Annotated with Explanatory Argumentative Structures

Explaining the decisions of Artificial Intelligence (AI) systems is a major challenge, particularly in sensitive domains such as medicine and law. The need to explain the rationale behind a decision is equally central to human deliberation, where it is important to justify why a certain decision was taken. Resident medical doctors, for instance, are required not only to provide a (possibly correct) diagnosis but also to explain how they reached that conclusion. Developing new tools that help residents train their explanation skills is therefore a central objective of AI in education. In this paper, we follow this direction and present, to the best of our knowledge, the first multilingual dataset for Medical Question Answering in which correct and incorrect diagnoses for a clinical case are enriched with a natural language explanation written by doctors. These explanations have been manually annotated with argument components (i.e., premise, claim) and argument relations (i.e., attack, support), resulting in the Multilingual CasiMedicos-Arg dataset, which consists of 558 clinical cases with explanations in four languages (English, Spanish, French, Italian), annotated with 5021 claims, 2313 premises, 2431 support relations, and 1106 attack relations. We conclude by showing how competitive baselines perform on this challenging dataset for the argument mining task.

