To realize effective large-scale, real-world robotic applications, we must evaluate how well our robot policies adapt to changes in environmental conditions. Unfortunately, the majority of studies evaluate robot performance in environments closely resembling or even identical to the training setup. We present THE COLOSSEUM, a novel simulation benchmark with 20 diverse manipulation tasks that enables systematic evaluation of models across 12 axes of environmental perturbations. These perturbations include changes in the color, texture, and size of objects, table-tops, and backgrounds; we also vary lighting, distractors, and camera pose. Using THE COLOSSEUM, we compare 4 state-of-the-art manipulation models and find that their success rate degrades by 30-50% across these perturbation factors. When multiple perturbations are applied in unison, the success rate degrades by $\geq$75%. We find that changing the number of distractor objects, the target object color, or the lighting conditions reduces model performance the most. To verify the ecological validity of our results, we show that our results in simulation are correlated ($\bar{R}^2 = 0.614$) with similar perturbations in real-world experiments. We open-source code for others to use THE COLOSSEUM, and also release code to 3D print the objects used to replicate the real-world perturbations. Ultimately, we hope that THE COLOSSEUM will serve as a benchmark to identify modeling decisions that systematically improve generalization for manipulation. See https://robot-colosseum.github.io/ for more details.