This study explores a learning-based manipulation task for a tri-finger robotic arm, which requires complex movements and coordination among the fingers. Using reinforcement learning, we train an agent to acquire the skills needed for proficient manipulation. To make learning more efficient and effective, we apply two knowledge transfer strategies, fine-tuning and curriculum learning, within the soft actor-critic architecture. Fine-tuning allows the agent to leverage pre-trained knowledge and adapt it to new tasks; several variations, including model transfer, policy transfer, and across-task transfer, were implemented and evaluated. Curriculum learning removes the need for pretraining by decomposing the advanced task into simpler, progressive stages, mirroring how humans learn. The number of learning stages, the context of the sub-tasks, and the transition timing were found to be the critical design parameters. We examine the key factors of both strategies and their effects in context-aware and context-unaware scenarios, which lets us identify where each method performs best and draw conclusions relevant to a broader range of learning-based engineering applications.
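To make the three curriculum design parameters concrete (number of stages, sub-task context, and transition timing), the following is a minimal sketch of a step-based stage schedule. It is not the authors' implementation: the `Stage` class, the `active_stage` helper, the sub-task names, and the step thresholds are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Stage:
    name: str        # hypothetical sub-task label (the "context" of the stage)
    start_step: int  # training step at which the stage becomes active (the "transition timing")


def active_stage(stages: List[Stage], step: int) -> Stage:
    """Return the latest stage whose start_step has been reached.

    Assumes `stages` is sorted by ascending start_step and that the
    first stage starts at step 0.
    """
    current = stages[0]
    for stage in stages:
        if step >= stage.start_step:
            current = stage
    return current


# Hypothetical three-stage curriculum; names and thresholds are illustrative only.
curriculum = [
    Stage("reach", 0),
    Stage("grasp", 100_000),
    Stage("move_to_goal", 300_000),
]
```

In a training loop, the agent would query `active_stage(curriculum, step)` at each update to decide which sub-task reward or goal to use; changing the list length, the stage names, or the `start_step` values corresponds to varying the three design parameters named above.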