Performing tasks in a physical environment is a crucial yet challenging problem for AI systems operating in the real world. Physics simulation-based tasks are often used to facilitate research on this challenge. In this paper, we first present a systematic approach for defining a physical scenario using a causal sequence of physical interactions between objects. We then propose a methodology for generating tasks in a physics-simulating environment using these defined scenarios as inputs. Our approach enables a better understanding of the granular mechanics required for solving physics-based tasks, thereby facilitating accurate evaluation of AI systems' physical reasoning capabilities. We demonstrate the proposed task generation methodology using the physics-based puzzle game Angry Birds and evaluate the generated tasks using a range of metrics, including physical stability, solvability using the intended physical interactions, and accidental solvability using unintended solutions. We believe that tasks generated with this methodology can facilitate a nuanced evaluation of physical reasoning agents, thus paving the way for the development of agents for more sophisticated real-world applications.