Autonomous systems face the challenge of navigating unpredictable environments and interacting with external objects. The successful integration of robotic agents into real-world settings hinges on their perception capabilities, which combine world models with predictive skills. Effective perception models fuse multiple sensory modalities to probe the surroundings. Deep learning applied to raw sensory data offers a viable option; however, learning-based perceptual representations are difficult to interpret. This challenge is particularly pronounced in soft robots, whose compliant structures and materials make prediction even harder. We address this complexity by harnessing a generative model to construct a multi-modal perception model for soft robots, leveraging proprioceptive and visual information to anticipate and interpret contact interactions with external objects. We provide a suite of tools to interpret the perception model, shedding light on how multiple sensory inputs are fused and predicted after the learning phase. Finally, we discuss the outlook of the perception model and its implications for control.
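To make the fusion idea concrete, the sketch below shows one common way a generative model can merge proprioceptive and visual inputs into a shared latent state and predict contact. This is a minimal illustration under stated assumptions, not the paper's architecture: the VAE-style latent space, the product-of-experts fusion, the contact head, and all dimensions are illustrative choices.

```python
# Illustrative multimodal generative perception model (PyTorch).
# Assumptions not taken from the abstract: VAE-style latent space,
# product-of-experts fusion, and the specific layer sizes.
import torch
import torch.nn as nn

class GaussianEncoder(nn.Module):
    """Maps one sensory modality to the mean/log-variance of a latent Gaussian."""
    def __init__(self, in_dim: int, latent_dim: int):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, 128), nn.ReLU())
        self.mu = nn.Linear(128, latent_dim)
        self.logvar = nn.Linear(128, latent_dim)

    def forward(self, x):
        h = self.net(x)
        return self.mu(h), self.logvar(h)

def product_of_experts(mus, logvars):
    """Fuse per-modality Gaussian posteriors into one joint posterior."""
    precisions = [torch.exp(-lv) for lv in logvars]
    total_prec = sum(precisions) + 1.0  # +1: standard-normal prior expert
    mu = sum(m * p for m, p in zip(mus, precisions)) / total_prec
    return mu, torch.log(1.0 / total_prec)

class MultimodalPerception(nn.Module):
    """Fuses proprioception and vision; predicts contact with an object."""
    def __init__(self, proprio_dim=12, vision_dim=64, latent_dim=8):
        super().__init__()
        self.proprio_enc = GaussianEncoder(proprio_dim, latent_dim)
        self.vision_enc = GaussianEncoder(vision_dim, latent_dim)
        self.contact_head = nn.Sequential(
            nn.Linear(latent_dim, 64), nn.ReLU(), nn.Linear(64, 1), nn.Sigmoid()
        )

    def forward(self, proprio, vision):
        mu_p, lv_p = self.proprio_enc(proprio)
        mu_v, lv_v = self.vision_enc(vision)
        mu, logvar = product_of_experts([mu_p, mu_v], [lv_p, lv_v])
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)  # reparameterize
        return self.contact_head(z), mu, logvar

# Usage: batch of 4 proprioceptive readings and visual feature vectors.
model = MultimodalPerception()
contact_prob, mu, logvar = model(torch.randn(4, 12), torch.randn(4, 64))
```

A shared probabilistic latent of this kind is also what makes post-hoc interpretation tools possible: one can inspect each modality's posterior precision to see how much it contributes to the fused estimate.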