Visual object recognition in unseen and cluttered indoor environments is a challenging problem for mobile robots. Toward this goal, we extend our previous work to propose the TOPS2 descriptor, and an accompanying recognition framework, THOR2, inspired by a human reasoning mechanism known as object unity. We interleave color embeddings obtained using the Mapper algorithm for topological soft clustering with the shape-based TOPS descriptor to obtain the TOPS2 descriptor. THOR2, trained using synthetic data, achieves substantially higher recognition accuracy than the shape-based THOR framework and outperforms RGB-D ViT on two real-world datasets: the benchmark OCID dataset and the UW-IS Occluded dataset. Therefore, THOR2 is a promising step toward achieving robust recognition in low-cost robots.