While recent advances in artificial intelligence have achieved human-level performance in environments like StarCraft and Go, many physical reasoning tasks remain challenging for modern algorithms. To date, few algorithms have been evaluated on physical tasks that involve manipulating objects when movable obstacles are present and when tools must be used to perform the manipulation. To promote research on such tasks, we introduce PushWorld, an environment with simplistic physics that requires manipulation planning with both movable obstacles and tools. We provide a benchmark of more than 200 PushWorld puzzles in PDDL and in an OpenAI Gym environment. We evaluate state-of-the-art classical planning and reinforcement learning algorithms on this benchmark and find that these baselines fall below human-level performance. We then provide a new classical planning heuristic that solves more puzzles than any of the baselines; although it is 40 times faster than the best baseline planner, it remains below human-level performance.