Surgical instrument segmentation is instrumental to minimally invasive surgery and related applications. Most previous methods formulate this task as single-frame instance segmentation, ignoring the natural temporal and stereo attributes of a surgical video. As a result, these methods are less robust to appearance variations caused by temporal motion and viewpoint changes. In this work, we propose a novel LACOSTE model that exploits Location-Agnostic COntexts in Stereo and TEmporal images for improved surgical instrument segmentation. Building on a query-based segmentation model as the core, we design three performance-enhancing modules. First, we design a disparity-guided feature propagation module that explicitly enhances depth-aware features. To generalize well even to monocular videos, we apply a pseudo-stereo scheme that generates complementary right images. Second, we propose a stereo-temporal set classifier that aggregates stereo-temporal contexts in a unified manner to make a consolidated prediction and mitigate transient failures. Finally, we propose a location-agnostic classifier that decouples location bias from mask prediction and enhances feature semantics. We extensively validate our approach on three public surgical video datasets, including two benchmarks from the EndoVis Challenges and GraSP, a real radical prostatectomy surgery dataset. Experimental results demonstrate the promising performance of our method, which consistently achieves results comparable to or better than previous state-of-the-art approaches.
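The core idea behind disparity-guided feature propagation is that a pixel in the left view corresponds to a horizontally shifted pixel in the right view, so right-view features can be warped into the left view using per-pixel disparity. The abstract does not specify the implementation; the sketch below is a minimal, hypothetical illustration of such disparity-based warping (nearest-neighbor sampling, NumPy), not the model's actual module.

```python
import numpy as np

def warp_right_to_left(feat_right, disparity):
    """Warp right-view features into the left view.

    For a rectified stereo pair, the left-view pixel (y, x) matches the
    right-view pixel (y, x - d), where d = disparity[y, x]. Pixels whose
    match falls outside the image are left as zeros (occlusion/border).

    feat_right: (H, W, C) feature map from the right view.
    disparity:  (H, W) per-pixel horizontal disparity for the left view.
    """
    H, W, _ = feat_right.shape
    warped = np.zeros_like(feat_right)
    for y in range(H):
        for x in range(W):
            xs = x - int(round(disparity[y, x]))  # matching column in the right view
            if 0 <= xs < W:
                warped[y, x] = feat_right[y, xs]
    return warped
```

In a pseudo-stereo setting, where only a monocular video is available, the same warping could be applied to synthesized right-view features, which is presumably why the paper's pseudo-stereo scheme generates complementary right images.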