Ultrasound is a widely used imaging modality in clinical practice because of its low cost, portability, and safety. Current research on general AI for healthcare focuses on large language models and general segmentation models, and pays insufficient attention to solutions that address both disease prediction and tissue segmentation. In this study, we propose UniUSNet, a novel universal and promptable framework for ultrasound image classification and segmentation. The model's universality derives from its versatility: it handles any ultrasound nature, any anatomical position, and any input type, and it performs well on both segmentation and classification tasks. We introduce a novel module that incorporates this information as a prompt and embeds it seamlessly within the model's learning process. To train and validate the proposed model, we curated a comprehensive ultrasound dataset from publicly accessible sources, covering up to 7 distinct anatomical positions with over 9.7K annotations. Experimental results show that our model achieves performance comparable to state-of-the-art models and surpasses both a model trained on a single dataset and an ablated version of the network without prompt guidance. In addition, zero-shot and fine-tuning experiments on new datasets demonstrate that the model generalizes well and can be adapted to new data at low cost through its adapter module. We will continue to expand the dataset and optimize the task-specific prompting mechanism to move toward universality in medical ultrasound. Model weights, data processing workflows, and code will be open-sourced (https://github.com/Zehui-Lin/UniUSNet).
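The four kinds of prompt information named above (ultrasound nature, anatomical position, input type, and task) can be illustrated with a minimal sketch that turns them into one concatenated one-hot vector. This is not the UniUSNet implementation: the vocabularies in `PROMPT_VOCAB`, the helper names `one_hot` and `encode_prompt`, and the choice of one-hot encoding are all assumptions made for illustration, since the abstract does not specify how the prompts are represented.

```python
# Hypothetical prompt vocabularies; the abstract does not enumerate them.
PROMPT_VOCAB = {
    "nature": ["organ", "tumor"],
    "position": ["breast", "thyroid", "liver"],
    "input_type": ["whole", "local"],
    "task": ["segmentation", "classification"],
}


def one_hot(value, choices):
    """Return a one-hot list marking where `value` sits in `choices`."""
    if value not in choices:
        raise ValueError(f"unknown prompt value {value!r}; expected one of {choices}")
    return [1.0 if choice == value else 0.0 for choice in choices]


def encode_prompt(**fields):
    """Concatenate the one-hot encodings of all prompt fields, in vocabulary order."""
    vector = []
    for name, choices in PROMPT_VOCAB.items():
        vector.extend(one_hot(fields[name], choices))
    return vector


# Example: a whole-image breast tumor scan submitted for segmentation.
prompt = encode_prompt(
    nature="tumor",
    position="breast",
    input_type="whole",
    task="segmentation",
)
print(prompt)
```

In a real network, a vector like this would typically be projected by a learned layer and combined with image features; how UniUSNet does this is defined in the released code rather than in this sketch.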