Amodal recognition is the ability of a system to detect occluded objects. Most state-of-the-art visual recognition systems cannot perform amodal recognition. A few studies have achieved it through passive prediction or embodied recognition approaches. However, these approaches face challenges in real-world applications, such as dynamic obstacles. We propose SeekNet, an improved optimization method for amodal recognition through embodied visual recognition. We implement SeekNet for social robots, which interact repeatedly with pedestrians in crowds, and demonstrate the benefits of our algorithm over other baselines on occluded human detection and tracking. We also set up a multi-robot environment with SeekNet to identify and track visual disease markers for airborne disease in crowded areas. We conduct our experiments in a simulated indoor environment and show that our method improves the overall accuracy of the amodal recognition task and achieves the largest improvement in detection accuracy over time compared with the baseline approaches.