We investigate policy transfer using image-to-semantics translation to mitigate the learning difficulties of vision-based robot control agents. The problem setting assumes two environments: a simulator environment whose state space consists of semantics, that is, low-dimensional, essential information, and a real-world environment whose state space consists of images. By learning a mapping from images to semantics, we can transfer a policy pre-trained in the simulator to the real world, thereby eliminating the costly and risky on-policy agent interactions that learning in the real world would otherwise require. In addition, compared with other types of sim-to-real transfer strategies, image-to-semantics mapping is advantageous in terms of the computational efficiency of policy training and the interpretability of the obtained policy. To tackle the main difficulty in learning an image-to-semantics mapping, namely the human annotation cost of producing a training dataset, we propose two techniques: pair augmentation with the transition function of the simulator environment, and active learning. We observed a reduction in the annotation cost without a decline in transfer performance, and the proposed approach outperformed the existing annotation-free approach.
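To make the idea of pair augmentation concrete, the following is a minimal sketch and not the implementation evaluated in this work. It assumes that a real-world trajectory of images and the actions taken between them is available, that a human has annotated the semantics of only a few frames, and that the simulator transition function is deterministic and matches the real dynamics closely enough to propagate those annotations forward. The names `augment_pairs` and `point_mass_step`, and the toy point-mass dynamics, are hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, List, Sequence, Tuple

State = Tuple[int, ...]        # semantics: low-dimensional state (toy integer version)
Pair = Tuple[str, State]       # (image identifier, semantics label)


def augment_pairs(
    images: Sequence[str],
    actions: Sequence[int],
    annotated: Dict[int, State],
    transition: Callable[[State, int], State],
) -> List[Pair]:
    """Propagate each human annotation forward along the trajectory.

    images[t + 1] is assumed to be observed after taking actions[t] at images[t],
    so len(images) == len(actions) + 1. A frame that has its own human
    annotation is never overwritten by a propagated label.
    """
    pairs: List[Pair] = []
    for start, semantics in sorted(annotated.items()):
        state = semantics
        pairs.append((images[start], state))
        for t in range(start, len(actions)):
            if t + 1 in annotated:
                break  # the next frame already has a human label
            # Assumption: the simulator transition function predicts the next semantics.
            state = transition(state, actions[t])
            pairs.append((images[t + 1], state))
    return pairs


def point_mass_step(state: State, action: int) -> State:
    """Hypothetical simulator dynamics: 1-D point mass with unit time step."""
    position, velocity = state
    return (position + velocity, velocity + action)


if __name__ == "__main__":
    frames = ["img0", "img1", "img2", "img3"]
    taken_actions = [1, 0, -1]
    # Only the first frame is annotated by a human.
    dataset = augment_pairs(frames, taken_actions, {0: (0, 0)}, point_mass_step)
    for image_id, label in dataset:
        print(image_id, label)
```

In this sketch a single annotation yields one training pair per frame of the trajectory, which illustrates how the annotation cost might be reduced. Any mismatch between the simulator and real dynamics would accumulate in the propagated labels, so how far an annotation can safely be propagated is itself an assumption to validate.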