Multi-agent embodied tasks have recently been studied in complex indoor visual environments. Collaboration among multiple agents can improve work efficiency and has significant practical value. However, most existing research focuses on homogeneous multi-agent tasks. Unlike homogeneous agents, heterogeneous agents can divide a complex task into sub-tasks matched to their different capabilities and cooperate to complete it. Heterogeneous multi-agent tasks are common in real-world scenarios, and designing collaboration strategies for heterogeneous agents is an important and challenging problem. To study such collaboration, we propose the heterogeneous multi-agent tidying-up task, in which multiple agents with different capabilities collaborate to detect misplaced objects and place them in reasonable locations. The task is demanding because agents must make the best use of their different capabilities to plan reasonably and complete the whole task. To support this task, we build a heterogeneous multi-agent tidying-up benchmark dataset covering a large number of multi-room houses based on ProcTHOR-10K. We propose a hierarchical decision model based on misplaced object detection, reasonable receptacle prediction, and a handshake-based group communication mechanism. Extensive experiments demonstrate the effectiveness of the proposed model. The project website and experiment videos are available at https://hetercol.github.io/.