Multi-agent embodied tasks have recently been studied in complex indoor visual environments. Collaboration among multiple agents can improve work efficiency and has significant practical value. However, most existing research focuses on homogeneous multi-agent tasks. Unlike homogeneous agents, heterogeneous agents can divide a task into sub-tasks matched to their different capabilities and cooperate to complete complex tasks. Heterogeneous multi-agent tasks are common in real-world scenarios, and the collaboration strategy among heterogeneous agents is a challenging and important open problem. To study collaboration among heterogeneous agents, we propose the heterogeneous multi-agent tidying-up task, in which multiple agents with different capabilities collaborate to detect misplaced objects and place them in reasonable locations. The task is demanding because it requires agents to make the best use of their different capabilities to plan reasonably and complete the whole task. To support this task, we build a heterogeneous multi-agent tidying-up benchmark dataset covering a large number of multi-room houses based on ProcTHOR-10K. We propose a hierarchical decision model based on misplaced object detection, reasonable receptacle prediction, and a handshake-based group communication mechanism. Extensive experiments demonstrate the effectiveness of the proposed model. The project website and experiment videos are available at https://hetercol.github.io/.