Emerging foundation models in machine learning are trained on vast amounts of data and have been shown to generalize well to new tasks. These models can often be prompted with multi-modal inputs, ranging from natural language descriptions to images and point clouds. In this paper, we propose topological data analysis (TDA) guided prompt optimization for the Segment Anything Model (SAM) and show preliminary results in the biological image segmentation domain. Our approach replaces the standard grid search used in the original implementation and instead selects point locations based on their topological significance. Our results show that the TDA-optimized point cloud is much better suited to finding small objects and, in scenarios that require many segmentations, greatly reduces computational complexity despite the extra optimization step.
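The abstract does not spell out how topological significance is computed. One common reading, assumed here purely for illustration, is 0-dimensional persistent homology of the image's superlevel sets: each bright local maximum is scored by how far the intensity must drop before its region merges with a brighter one, and only high-scoring maxima are kept as point prompts. The sketch below is dependency-free Python; the names `persistent_peaks` and `min_persistence` are hypothetical, and this is not the authors' implementation.

```python
import math


def persistent_peaks(image, min_persistence):
    """Rank bright peaks of a 2D image by 0-dimensional persistence.

    Pixels are added from brightest to darkest (superlevel-set filtration).
    Each local maximum starts a component; when two components meet, the one
    with the lower peak dies (elder rule). Its persistence is its peak value
    minus the intensity at which the merge happens.
    Returns (row, col, persistence) tuples, most persistent first.
    """
    h, w = len(image), len(image[0])
    vals = [float(v) for row in image for v in row]  # row-major flattening
    order = sorted(range(h * w), key=lambda i: -vals[i])  # brightest first
    parent = [-1] * (h * w)    # -1 marks pixels not yet added
    peak = list(range(h * w))  # for a root: flat index of its highest pixel

    def find(i):
        root = i
        while parent[root] != root:
            root = parent[root]
        while parent[i] != root:  # path compression
            nxt = parent[i]
            parent[i] = root
            i = nxt
        return root

    found = []
    for idx in order:
        parent[idx] = idx
        r, c = divmod(idx, w)
        for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if not (0 <= rr < h and 0 <= cc < w):
                continue
            n = rr * w + cc
            if parent[n] == -1:
                continue
            a, b = find(idx), find(n)
            if a == b:
                continue
            if vals[peak[a]] < vals[peak[b]]:
                a, b = b, a  # make `a` the component with the higher peak
            persistence = vals[peak[b]] - vals[idx]
            if persistence > 0:
                found.append((peak[b] // w, peak[b] % w, persistence))
            parent[b] = a  # the component with the lower peak is absorbed

    # The global maximum never dies; as a convention (an assumption of this
    # sketch) its persistence is taken to be max intensity minus min intensity.
    top = order[0]
    found.append((top // w, top % w, vals[top] - vals[order[-1]]))

    kept = [p for p in found if p[2] >= min_persistence and p[2] > 0]
    kept.sort(key=lambda p: -p[2])
    return kept


# Toy image: two clear blobs and one faint bump that should be filtered out.
def blob(r, c, r0, c0, amp, sigma):
    return amp * math.exp(-((r - r0) ** 2 + (c - c0) ** 2) / (2 * sigma ** 2))


toy = [[blob(r, c, 8, 8, 1.0, 2.0) + blob(r, c, 22, 20, 0.6, 2.0)
        + blob(r, c, 5, 25, 0.05, 1.0) for c in range(32)] for r in range(32)]

for row, col, pers in persistent_peaks(toy, min_persistence=0.3):
    print(f"prompt point at row={row}, col={col}, persistence={pers:.3f}")
```

Unlike a regular grid, the number of candidate points here follows the image content, so the count of downstream segmentation calls depends on `min_persistence` rather than on grid resolution. How the selected points are passed to SAM, including the coordinate order it expects, is outside the scope of this sketch.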