Tracking inventory and rearranging misplaced items are among the most labor-intensive tasks in a retail environment. While vision-based techniques have been applied to these tasks, they mostly rely on planogram compliance to detect anomalies, a technique that has been found lacking in robustness and scalability. Moreover, existing systems rely on human intervention to perform corrective actions after detection. In this paper, we present Co-AD, a Concept-based Anomaly Detection approach using a Vision Transformer (ViT) that flags misplaced objects without a prior knowledge base such as a planogram. It uses an auto-encoder architecture followed by outlier detection in the latent space. Co-AD achieves a peak success rate of 89.90% on anomaly detection image sets of retail objects drawn from the RP2K dataset, compared to 80.81% for the best-performing baseline, a standard ViT auto-encoder. To demonstrate its utility, we describe a robotic mobile manipulation pipeline that autonomously corrects the anomalies flagged by Co-AD. This work is ultimately aimed at developing autonomous mobile robot solutions that reduce the need for human intervention in retail store management.
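To make the idea of outlier detection in the latent space concrete, the following is a minimal sketch rather than the Co-AD implementation. It assumes the latent vectors have already been produced by some encoder (not shown), and it swaps in a generic rule, the modified z-score based on the median absolute deviation, since the abstract does not specify which outlier detector is used. The function name `flag_latent_outliers`, the threshold of 3.5, and the toy 2-D latents are all illustrative assumptions.

```python
import numpy as np


def flag_latent_outliers(latents: np.ndarray, threshold: float = 3.5) -> np.ndarray:
    """Return a boolean mask marking latent vectors that lie far from the rest.

    Hypothetical helper: a generic modified z-score rule, not the paper's detector.
    """
    latents = np.asarray(latents, dtype=float)
    # Coordinate-wise median serves as a robust centre of the group.
    center = np.median(latents, axis=0)
    # Euclidean distance of each latent vector from that centre.
    dist = np.linalg.norm(latents - center, axis=1)
    # Median absolute deviation (MAD) of the distances.
    med = np.median(dist)
    mad = np.median(np.abs(dist - med))
    if mad == 0:
        # No spread to measure against, so flag nothing.
        return np.zeros(len(latents), dtype=bool)
    # Modified z-score; 0.6745 rescales the MAD to match a standard deviation
    # under a normal distribution.
    score = 0.6745 * (dist - med) / mad
    return score > threshold


# Toy latents (assumed data): six vectors near (1, 1) and one far away.
latents = np.array([
    [0.9, 1.0], [1.0, 1.1], [1.1, 0.9],
    [1.0, 0.8], [0.8, 1.0], [1.2, 1.0],
    [5.0, 5.0],
])
mask = flag_latent_outliers(latents)
print(mask)
```

In this toy example only the last vector is flagged. In a shelf-monitoring setting, each row would correspond to one detected object's embedding, and a flagged row would mark a candidate misplaced item.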