We propose a novel self-supervised approach for learning to visually localize robots equipped with controllable LEDs. We rely on a few training samples labeled with position ground truth and on many training samples in which only the LED state is known, which are cheap to collect. We show that using LED state prediction as a pretext task significantly helps to learn the visual localization end task. The resulting model does not require knowledge of LED states during inference. We instantiate the approach to visual relative localization of nano-quadrotors: experimental results show that using our pretext task significantly improves localization accuracy (from 68.3% to 76.2%) and outperforms alternative strategies, such as a supervised baseline, model pre-training, and an autoencoding pretext task. We deploy our model aboard a 27-g Crazyflie nano-drone, running at 21 fps, in a position-tracking task of a peer nano-drone. Our approach, relying on position labels for only 300 images, yields a mean tracking error of 4.2 cm versus 11.9 cm for a supervised baseline model trained without our pretext task. Videos and code of the proposed approach are available at https://github.com/idsia-robotics/leds-as-pretext.
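To make the training setup concrete, below is a minimal sketch of joint training with LED-state prediction as a pretext task, assuming a shared encoder with a position head (supervised on the few labeled images) and an LED head (supervised on the many LED-labeled images). The network architecture, number of LEDs, loss choices, and loss weighting are illustrative assumptions, not the paper's exact implementation.

```python
import torch
import torch.nn as nn

class LocalizerWithPretext(nn.Module):
    def __init__(self, num_leds: int = 4):
        super().__init__()
        # Shared convolutional encoder (assumed small CNN suitable for a nano-drone camera).
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # End-task head: relative position of the peer drone (x, y, z).
        self.position_head = nn.Linear(32, 3)
        # Pretext head: on/off state of each controllable LED.
        self.led_head = nn.Linear(32, num_leds)

    def forward(self, images):
        features = self.encoder(images)
        return self.position_head(features), self.led_head(features)

model = LocalizerWithPretext()
pos_loss = nn.MSELoss()            # supervised loss on the few position-labeled samples
led_loss = nn.BCEWithLogitsLoss()  # pretext loss on the many LED-labeled samples
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

def training_step(labeled_batch, pretext_batch, pretext_weight=1.0):
    """One joint update: position supervision plus LED-state pretext supervision."""
    imgs_l, positions = labeled_batch    # few images with position ground truth
    imgs_p, led_states = pretext_batch   # cheaply collected images with known LED states
    pred_pos, _ = model(imgs_l)
    _, pred_led = model(imgs_p)
    loss = pos_loss(pred_pos, positions) + pretext_weight * led_loss(pred_led, led_states)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

At inference only the position head is used, so knowledge of the LED states is not required, consistent with the deployment described above.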