ACDC: Adaptive Curriculum Planning with Dynamic Contrastive Control for Goal-Conditioned Reinforcement Learning in Robotic Manipulation

Source: arXiv. 13 pages (including references and appendix), 12 figures. Accepted to ICAPS 2026. Code is available at https://github.com/Xuerui-Wang-oss/Adaptive-Curriculum-Learning-and-Dynamic-Contrastive-Control

Goal-conditioned reinforcement learning has shown considerable potential in robotic manipulation; however, existing approaches remain limited by their reliance on prioritizing collected experience, resulting in suboptimal performance across diverse tasks. Inspired by human learning behaviors, we propose a more comprehensive learning paradigm, ACDC, which integrates multidimensional Adaptive Curriculum (AC) Planning with Dynamic Contrastive (DC) Control to guide the agent along a well-designed learning trajectory. More specifically, at the planning level, the AC component schedules the learning curriculum by dynamically balancing diversity-driven exploration and quality-driven exploitation based on the agent's success rate and training progress. At the control level, the DC component implements the curriculum plan through norm-constrained contrastive learning, enabling magnitude-guided experience selection aligned with the current curriculum focus. Extensive experiments on challenging robotic manipulation tasks demonstrate that ACDC consistently outperforms the state-of-the-art baselines in both sample efficiency and final task success rate.
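
The abstract says the AC component balances diversity-driven exploration against quality-driven exploitation based on success rate and training progress, but it does not give the schedule itself. The block below is only a minimal sketch of what such a balance could look like. The function names, the multiplicative weight, and the linear blend are all hypothetical assumptions made for illustration and are not taken from the paper or its repository.

```python
def exploration_weight(success_rate: float, progress: float) -> float:
    """Hypothetical weight in [0, 1] placed on diversity-driven exploration.

    Assumption (not from the paper): exploration should dominate early in
    training and while the agent is still failing, then fade out.
    """
    if not (0.0 <= success_rate <= 1.0 and 0.0 <= progress <= 1.0):
        raise ValueError("success_rate and progress must lie in [0, 1]")
    return (1.0 - success_rate) * (1.0 - progress)


def priority(diversity: float, quality: float,
             success_rate: float, progress: float) -> float:
    """Blend two hypothetical per-sample scores into one selection priority."""
    w = exploration_weight(success_rate, progress)
    # Linear blend: w favors diverse samples, (1 - w) favors high-quality ones.
    return w * diversity + (1.0 - w) * quality


if __name__ == "__main__":
    # Early, failing agent: the priority follows the diversity score.
    print(priority(diversity=2.0, quality=4.0, success_rate=0.0, progress=0.0))
    # Late, successful agent: the priority follows the quality score.
    print(priority(diversity=2.0, quality=4.0, success_rate=1.0, progress=1.0))
```

Under these assumptions, a sample's priority moves smoothly from its diversity score to its quality score as the agent succeeds more often and training advances. The paper's actual schedule, and the norm-constrained contrastive objective used by the DC component, may differ substantially from this sketch.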
