RADAR: Closed-Loop Robotic Data Generation via Semantic Planning and Autonomous Causal Environment Reset

Acquiring large-scale physical interaction data, a prerequisite for modern robot learning, is bottlenecked by the cost and limited scalability of human-in-the-loop collection. To address this, we introduce Robust Autonomous Data Acquisition for Robotics (RADAR), a fully autonomous, closed-loop data generation engine that removes human intervention from the collection cycle. RADAR divides the work across a four-module pipeline. First, anchored by two to five 3D human demonstrations as geometric priors, a Vision-Language Model (VLM) generates scene-relevant tasks through semantic object grounding and skill retrieval. Second, a Graph Neural Network policy translates the resulting subtasks into physical actions via in-context imitation learning. Third, after execution, the VLM evaluates success automatically using a structured Visual Question Answering pipeline. Finally, to eliminate manual resets, a Finite State Machine orchestrates an autonomous environment reset and an asymmetric data routing mechanism. The reset is driven by simultaneous forward-reverse planning under a strict Last-In, First-Out (LIFO) causal sequence, which lets the system restore unstructured workspaces and recover from execution failures. This continuous "brain-cerebellum" coupling of semantic planning and low-level control turns data collection into a self-sustaining process. In simulation, RADAR achieves success rates of up to 90% on complex, long-horizon tasks on which traditional baselines fall to near-zero performance. In real-world deployments, the system reliably executes diverse, contact-rich skills (e.g., deformable object manipulation) via few-shot adaptation, without domain-specific fine-tuning, offering a scalable paradigm for robotic data acquisition.
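To make the Last-In, First-Out reset concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes each forward subtask is planned together with a paired reverse action, and that only the subtasks that completed are undone after a failure. The names `Subtask`, `run_episode`, and the `succeeded` callback (standing in for the VLM success check) are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Subtask:
    name: str     # forward action, e.g. "open drawer"
    inverse: str  # paired reverse action, planned at the same time (assumption)


def run_episode(
    plan: List[Subtask],
    succeeded: Callable[[Subtask], bool],
) -> Tuple[bool, List[str]]:
    """Attempt subtasks in order, then undo the completed ones in LIFO order.

    `succeeded` is a hypothetical stand-in for the automated success check.
    Returns (episode_success, ordered log of attempted actions).
    """
    undo_stack: List[str] = []
    log: List[str] = []
    success = True

    for task in plan:
        log.append(task.name)
        if not succeeded(task):
            success = False  # stop the forward pass on the first failure
            break
        undo_stack.append(task.inverse)

    # Reset phase: the most recently completed subtask is reversed first.
    while undo_stack:
        log.append(undo_stack.pop())

    return success, log


plan = [
    Subtask("open drawer", "close drawer"),
    Subtask("place cup in drawer", "remove cup from drawer"),
]
ok, actions = run_episode(plan, lambda task: True)
print(ok, actions)
```

In this sketch the cup is removed before the drawer is closed, which is the causal ordering a LIFO sequence enforces. If the second subtask fails, only the drawer is closed again. The returned success flag could then drive a routing step that stores successful and failed episodes separately.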