Building a low-latency humanoid teleoperation system is essential for collecting diverse reactive and dynamic demonstrations. However, existing approaches rely on heavily pre-processed human-to-humanoid motion retargeting and position-only PD control, incurring substantial latency that severely limits responsiveness and precludes tasks that require rapid feedback and fast reactions. To address this problem, we propose ExtremControl, a low-latency whole-body control framework that: (1) operates directly on SE(3) poses of selected rigid links, primarily humanoid extremities, to avoid full-body retargeting; (2) uses a Cartesian-space mapping to convert human motion directly into humanoid link targets; and (3) incorporates velocity feedforward control at the low level to support highly responsive behavior under rapidly changing control interfaces. We further provide a unified theoretical formulation of ExtremControl and systematically validate its effectiveness through experiments in both simulation and real-world environments. Building on ExtremControl, we implement a low-latency humanoid teleoperation system that supports both optical motion capture and VR-based motion tracking, achieving end-to-end latency as low as 50 ms and enabling highly responsive behaviors such as ping-pong ball balancing, juggling, and real-time return, thereby substantially surpassing the 200 ms latency limit observed in prior work.
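The velocity feedforward point can be made concrete with a minimal sketch. This is not the paper's controller; the gains and state values below are hypothetical, chosen only to show why position-only PD (which implicitly commands zero target velocity) produces damping torque that fights a fast-moving target, while feeding forward the desired joint velocity removes that drag.

```python
import numpy as np

def pd_torque(q, qd, q_des, qd_des, kp, kd):
    """Joint-space PD law with velocity feedforward.

    Position-only PD corresponds to qd_des = 0: the damping term then
    opposes any commanded motion. Supplying the target velocity as a
    feedforward term avoids this lag under rapidly changing targets.
    """
    return kp * (q_des - q) + kd * (qd_des - qd)

# Hypothetical single-joint example (gains/values are illustrative only).
kp, kd = np.array([80.0]), np.array([4.0])
q, qd = np.array([0.0]), np.array([1.0])          # joint tracking a moving target at 1 rad/s
q_des, qd_des = np.array([0.125]), np.array([1.0])

tau_ff = pd_torque(q, qd, q_des, qd_des, kp, kd)          # with velocity feedforward
tau_pos = pd_torque(q, qd, q_des, np.zeros(1), kp, kd)    # position-only (qd_des = 0)
```

With the target moving at the same speed as the joint, the feedforward variant applies pure position correction, whereas the position-only variant subtracts a damping torque proportional to the joint's own velocity, effectively braking the tracking motion.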