ARNOLD: A Benchmark for Language-Grounded Task Learning With Continuous States in Realistic 3D Scenes

Ran Gong, Jiangyong Huang, Yizhou Zhao, Haoran Geng, Xiaofeng Gao, Qingyang Wu, Wensi Ai, Ziheng Zhou, Demetri Terzopoulos, Song-Chun Zhu, Baoxiong Jia, Siyuan Huang

From arXiv. The first two authors contributed equally; 20 pages; 17 figures; project available at https://arnold-benchmark.github.io/; ICCV 2023.

Understanding the continuous states of objects is essential for task learning and planning in the real world. However, most existing task learning benchmarks assume discrete (e.g., binary) object goal states, which poses challenges for learning complex tasks and for transferring learned policies from simulated environments to the real world. Furthermore, state discretization limits a robot's ability to follow human instructions based on the grounding of actions and states. To tackle these challenges, we present ARNOLD, a benchmark that evaluates language-grounded task learning with continuous states in realistic 3D scenes. ARNOLD comprises 8 language-conditioned tasks that involve understanding object states and learning policies for continuous goals. To promote language-instructed learning, we provide expert demonstrations with template-generated language descriptions. We assess task performance using the latest language-conditioned policy learning models. Our results indicate that current models for language-conditioned manipulation still face significant challenges in generalizing to novel goal states, novel scenes, and novel objects. These findings highlight the need to develop new algorithms that address this gap and underscore the potential for further research in this area. Project website: https://arnold-benchmark.github.io.

