Medical researchers and clinicians often need to perform novel segmentation tasks on a set of related images. Existing methods for segmenting a new dataset are either interactive, requiring substantial human effort for each image, or require an existing set of manually labeled images. We introduce MultiverSeg, a system that enables practitioners to rapidly segment an entire new dataset without access to any existing labeled data from that task or domain. Along with the image to segment, the model takes user interactions such as clicks, bounding boxes, or scribbles as input, and predicts a segmentation. As the user segments more images, those images and their segmentations become additional inputs to the model, providing context. As this context set of labeled images grows, the number of interactions required to segment each new image decreases. We demonstrate that MultiverSeg enables users to segment new datasets efficiently by amortizing, across the dataset, the interactions needed to achieve accurate segmentations. Compared to a state-of-the-art interactive segmentation method, MultiverSeg reduced the total number of scribble steps by 53% and clicks by 36% to reach 90% Dice on sets of images from unseen tasks. We release code and model weights at https://multiverseg.csail.mit.edu.
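The interaction loop described above can be sketched in pseudocode-style Python. This is a minimal illustration of the described protocol, not the released MultiverSeg API: `predict`, `acquire_interaction`, and `is_accurate` are hypothetical stand-ins for the neural network, the user's clicks/scribbles, and the user's quality judgment.

```python
# Hypothetical sketch of the interaction loop: each finished segmentation
# joins the context set, so later images should need fewer interactions.
def segment_dataset(images, predict, acquire_interaction, is_accurate):
    context = []             # grows with each labeled (image, segmentation) pair
    interaction_counts = []  # interactions spent on each image
    for image in images:
        interactions, seg = [], None
        while seg is None or not is_accurate(image, seg):
            # Collect one more user interaction (click, box, or scribble),
            # then re-predict using the interactions plus the context set.
            interactions.append(acquire_interaction(image, seg))
            seg = predict(image, interactions, context)
        context.append((image, seg))
        interaction_counts.append(len(interactions))
    return context, interaction_counts

# Toy simulation (not a real model): the "model" succeeds after fewer
# interactions as the context set grows, mimicking the amortization effect.
def toy_predict(image, interactions, context):
    needed = max(1, 3 - len(context))
    return image if len(interactions) >= needed else None

context, counts = segment_dataset(
    images=["scan_a", "scan_b", "scan_c", "scan_d"],
    predict=toy_predict,
    acquire_interaction=lambda img, seg: "click",
    is_accurate=lambda img, seg: seg is not None,
)
# counts decreases as context accumulates: [3, 2, 1, 1]
```

The key design point is that the context set is an *input* to the model rather than training data, so no retraining or fine-tuning happens between images.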