Computer-assisted interventions require both real-time computational speed and a high degree of precision. Applying a segmentation network to a medical video processing task can introduce significant inter-frame prediction noise. Existing approaches can reduce these inconsistencies by including temporal information but often impose requirements on the architecture or dataset. This paper proposes a method for including temporal information in any segmentation model, and thus a technique for improving video segmentation performance without alterations during training or additional labeling. With Motion-Corrected Moving Average, we refine the exponential moving average between the current and previous predictions. Using optical flow to estimate the movement between consecutive frames, we shift the prior term in the moving-average calculation to align with the geometry of the current frame. Because the optical flow calculation does not require the output of the model, it can be performed in parallel, so our approach incurs no significant runtime penalty. We evaluate our approach on two publicly available segmentation datasets and two proprietary endoscopic datasets and show improvements over a baseline approach.
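The idea of shifting the prior term before averaging can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the smoothing weight `alpha`, the flow convention (each current-frame pixel points to its location in the previous frame), and the nearest-neighbour warping are all assumptions made for illustration, and the flow field is taken as given rather than computed by an optical flow estimator.

```python
import numpy as np


def warp_with_flow(prev, flow):
    """Backward-warp `prev` (H, W) into the current frame's geometry.

    Assumption: flow[y, x] = (dx, dy) points from pixel (x, y) in the
    current frame to its location in the previous frame. Sampling is
    nearest-neighbour, with coordinates clamped to the image border.
    """
    h, w = prev.shape
    ys, xs = np.mgrid[0:h, 0:w]
    src_x = np.clip(np.rint(xs + flow[..., 0]), 0, w - 1).astype(int)
    src_y = np.clip(np.rint(ys + flow[..., 1]), 0, h - 1).astype(int)
    return prev[src_y, src_x]


def motion_corrected_ema(current, prev_smoothed, flow, alpha=0.5):
    """Exponential moving average whose prior term is motion-aligned first.

    `alpha` is a hypothetical smoothing weight, not a value from the paper.
    """
    aligned_prior = warp_with_flow(prev_smoothed, flow)
    return alpha * current + (1.0 - alpha) * aligned_prior


# Toy example: a one-pixel "object" moves one pixel to the right.
prev_pred = np.zeros((4, 4))
prev_pred[1, 1] = 1.0
curr_pred = np.zeros((4, 4))
curr_pred[1, 2] = 1.0

# Every current pixel came from one pixel to its left in the previous frame.
flow = np.zeros((4, 4, 2))
flow[..., 0] = -1.0

plain_ema = 0.5 * curr_pred + 0.5 * prev_pred
corrected = motion_corrected_ema(curr_pred, prev_pred, flow, alpha=0.5)
```

In this toy case the plain moving average smears the prediction across the old and new object positions, while the motion-corrected version keeps it concentrated at the current position. Since the flow depends only on the input frames, it could in principle be computed while the segmentation model runs, which is the parallelism the abstract refers to.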