Testing on real machines is indispensable for robotic control algorithms. For learning-based algorithms, especially VLA models, the demand for large-scale evaluation, i.e., testing a large number of models on a large number of tasks, is becoming increasingly urgent. However, doing this right is highly non-trivial, especially when scalability and reproducibility are taken into account. In this report, we describe our methodology for constructing RoboChallenge, an online evaluation system for testing robotic control algorithms, and present our survey of recent state-of-the-art VLA models using our initial benchmark, Table30.