Interactive segmentation methods rely on user inputs to iteratively update the selection mask. A click specifying the object of interest is arguably the simplest and most intuitive interaction type, and therefore the most common choice for interactive segmentation. However, user clicking patterns in the interactive segmentation context remain unexplored. Accordingly, interactive segmentation evaluation strategies rely more on intuition and common sense than on empirical studies (e.g., assuming that users tend to click in the center of the area with the largest error). In this work, we conduct a user study to investigate real clicking patterns. The study reveals that the intuitive assumption made in the common evaluation strategy may not hold. As a result, interactive segmentation models may achieve high scores on standard benchmarks, yet this does not imply that they would perform well in a real-world scenario. To assess the applicability of interactive segmentation methods, we propose a novel evaluation strategy that provides a more comprehensive analysis of a model's performance. To this end, we propose a methodology for finding extreme user inputs by direct optimization in a white-box adversarial attack on the interactive segmentation model. Based on the performance under such adversarial user inputs, we assess the robustness of interactive segmentation models with respect to click positions. In addition, we introduce a novel benchmark for measuring the robustness of interactive segmentation and report the results of an extensive evaluation of dozens of models.
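To make the baseline assumption concrete, the following is a minimal sketch of a click simulator that places the next click at the "center" of the largest error area. It is an illustration under stated assumptions, not the evaluation code used in this work: the function names `interior_distance` and `simulate_click` are hypothetical, "center" is taken to mean the mislabeled pixel farthest from the boundary of its error region, and a city-block distance is used in place of a Euclidean one to keep the example dependency-free.

```python
from collections import deque

import numpy as np


def interior_distance(mask):
    """City-block distance from each True pixel to the nearest False pixel.

    The mask is padded with False, so the image border counts as a boundary.
    """
    padded = np.pad(mask.astype(bool), 1, constant_values=False)
    dist = np.full(padded.shape, -1, dtype=int)
    queue = deque()
    # Multi-source breadth-first search seeded at every False pixel.
    for y, x in zip(*np.nonzero(~padded)):
        dist[y, x] = 0
        queue.append((y, x))
    while queue:
        y, x = queue.popleft()
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ny, nx = y + dy, x + dx
            inside = 0 <= ny < padded.shape[0] and 0 <= nx < padded.shape[1]
            if inside and dist[ny, nx] < 0:
                dist[ny, nx] = dist[y, x] + 1
                queue.append((ny, nx))
    return dist[1:-1, 1:-1]


def simulate_click(pred, gt):
    """Return (row, col, is_positive) for the next simulated click, or None.

    Assumption: the simulated user clicks the most interior mislabeled pixel.
    A click is positive when it lands on a missed object pixel (false
    negative) and negative when it lands on a false positive.
    """
    pred, gt = pred.astype(bool), gt.astype(bool)
    fn_dist = interior_distance(gt & ~pred)  # object pixels the model missed
    fp_dist = interior_distance(pred & ~gt)  # background marked as object
    is_positive = fn_dist.max() >= fp_dist.max()
    dist = fn_dist if is_positive else fp_dist
    if dist.max() == 0:
        return None  # prediction already matches the ground truth
    y, x = np.unravel_index(np.argmax(dist), dist.shape)
    return int(y), int(x), bool(is_positive)
```

A deterministic rule of this kind always returns the same click for a given prediction, which is exactly the property that real clicking behavior may violate; perturbing the returned position is one simple way to probe how sensitive a model is to where the click lands.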