Medical Vision-Language Pretraining (MedVLP) shows promise in learning generalizable and transferable visual representations from paired and unpaired medical images and reports. MedVLP can provide useful features for downstream tasks and facilitate adapting task-specific models to new setups with fewer examples. However, existing MedVLP methods often differ in terms of datasets, preprocessing, and finetuning implementations. This poses great challenges in evaluating how well a MedVLP method generalizes to various clinically relevant tasks due to the lack of a unified, standardized, and comprehensive benchmark. To fill this gap, we propose BenchX, a unified benchmark framework that enables head-to-head comparison and systematic analysis of MedVLP methods using public chest X-ray datasets. Specifically, BenchX comprises three components: 1) comprehensive datasets covering nine datasets and four medical tasks; 2) benchmark suites that standardize data preprocessing, train-test splits, and parameter selection; 3) unified finetuning protocols that accommodate heterogeneous MedVLP methods for consistent task adaptation in classification, segmentation, and report generation. Using BenchX, we establish baselines for nine state-of-the-art MedVLP methods and find that the performance of some early MedVLP methods can be enhanced to surpass more recent ones, prompting a revisit of the developments and conclusions from prior works in MedVLP. Our code is available at https://github.com/yangzhou12/BenchX.
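To make the "unified finetuning protocol" idea concrete, the sketch below shows one way heterogeneous MedVLP backbones could be placed behind a common interface so that every method receives the identical task head and adaptation procedure. This is a hypothetical illustration, not BenchX's actual API: the class names, the `encode` method, and the linear classification head are all assumptions introduced here for clarity.

```python
# Hypothetical sketch of a unified finetuning protocol (NOT BenchX's actual API).
# The idea: wrap heterogeneous MedVLP vision encoders behind one interface so
# every method receives the identical task head and training setup.
import torch
import torch.nn as nn

class MedVLPWrapper(nn.Module):
    """Adapter exposing a common feature interface for any MedVLP backbone."""
    def __init__(self, backbone: nn.Module, feat_dim: int):
        super().__init__()
        self.backbone = backbone  # e.g., a ResNet or ViT image encoder
        self.feat_dim = feat_dim  # backbone-specific output dimension

    def encode(self, images: torch.Tensor) -> torch.Tensor:
        # Each method maps images to a flat feature vector here; any
        # method-specific pooling or projection is hidden inside the wrapper.
        return self.backbone(images)

class ClassificationProtocol(nn.Module):
    """Identical task adaptation applied to every wrapped MedVLP method."""
    def __init__(self, wrapper: MedVLPWrapper, num_classes: int):
        super().__init__()
        self.wrapper = wrapper
        self.head = nn.Linear(wrapper.feat_dim, num_classes)

    def forward(self, images: torch.Tensor) -> torch.Tensor:
        return self.head(self.wrapper.encode(images))

# Usage with a stand-in backbone; a real run would load pretrained MedVLP weights.
backbone = nn.Sequential(nn.Flatten(), nn.Linear(3 * 224 * 224, 512))
model = ClassificationProtocol(MedVLPWrapper(backbone, feat_dim=512), num_classes=14)
logits = model(torch.randn(2, 3, 224, 224))  # -> shape (2, 14)
```

Under this kind of design, differences in downstream performance can be attributed to the pretrained representations themselves rather than to per-method finetuning choices.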