CoViS-Net: A Cooperative Visual Spatial Foundation Model for Multi-Robot Applications

Spatial understanding from vision is crucial for robots operating in unstructured environments. In the real world, spatial understanding is often an ill-posed problem. A number of powerful classical methods accurately regress relative pose; however, these approaches often lack the ability to leverage data-derived priors to resolve ambiguities. In multi-robot systems, these challenges are exacerbated by the need for accurate and frequent position estimates of cooperating agents. To this end, we propose CoViS-Net, a cooperative, multi-robot, visual spatial foundation model that learns spatial priors from data. Unlike prior work evaluated primarily on offline datasets, we design our model specifically for online evaluation and real-world deployment on cooperative robots. Our model is completely decentralized, platform-agnostic, executable in real time using onboard compute, and does not require existing network infrastructure. In this work, we focus on relative pose estimation and local Bird's Eye View (BEV) prediction tasks. Unlike classical approaches, we show that our model can accurately predict relative poses without requiring camera overlap, and can predict BEVs of regions not visible to the ego-agent. We demonstrate our model on a multi-robot formation control task outside the confines of the laboratory.

