Imitation learning enables robots to learn and replicate human behavior from training data. Recent advances in machine learning have enabled end-to-end approaches that directly process high-dimensional observations such as images. However, these approaches face a critical challenge when processing data from multiple modalities: they tend to inadvertently ignore data with a lower correlation to the desired output, especially when short sampling periods are used. This paper presents a method that addresses this challenge by feeding such data into every neural network layer, which amplifies the influence of data with a relatively low correlation to the output. The proposed approach thereby incorporates diverse data sources into the learning process effectively. In experiments on a simple pick-and-place operation with raw images and joint information as input, the method significantly improves success rates, even with data collected at short sampling periods.
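The core idea of inputting a low-correlation modality into each layer can be illustrated with a minimal sketch. It is not the architecture used in the paper, which the abstract does not specify. It assumes a plain fully connected network in NumPy, a flattened image feature vector, and a 7-dimensional joint vector. The names `init_layers` and `forward` and all dimensions are hypothetical choices made for illustration.

```python
import numpy as np


def init_layers(rng, image_dim, joint_dim, hidden_dims, out_dim):
    """Create weights for an MLP in which every layer also receives the joint vector.

    Assumption: a fully connected network stands in for whatever model the
    paper actually uses; only the "inject at every layer" pattern is shown.
    """
    layers = []
    in_dim = image_dim
    for width in list(hidden_dims) + [out_dim]:
        # Each layer sees the previous activation plus the raw joint vector.
        fan_in = in_dim + joint_dim
        W = rng.standard_normal((fan_in, width)) * np.sqrt(2.0 / fan_in)
        b = np.zeros(width)
        layers.append((W, b))
        in_dim = width
    return layers


def forward(layers, image_feat, joints):
    """Forward pass that concatenates `joints` to the input of every layer."""
    h = image_feat
    for i, (W, b) in enumerate(layers):
        h = np.concatenate([h, joints], axis=-1) @ W + b
        if i < len(layers) - 1:
            h = np.maximum(h, 0.0)  # ReLU on hidden layers only
    return h


rng = np.random.default_rng(0)
layers = init_layers(rng, image_dim=64, joint_dim=7, hidden_dims=(32, 32), out_dim=7)

image_feat = rng.standard_normal((4, 64))  # batch of 4 flattened image features
joints = rng.standard_normal((4, 7))       # batch of 4 joint-state vectors
action = forward(layers, image_feat, joints)
print(action.shape)  # (4, 7)
```

Because the joint vector enters the final layer directly, it reaches the output through a path that does not depend on how the earlier layers weight the image features. This is one plausible reading of how the influence of a low-correlation input is amplified.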