We propose a finite-state, decentralized decision and control framework for multi-agent ground coverage. The approach decomposes the problem into two coupled components: (i) the structural design of a deep neural network (DNN) induced by the reference configuration of the agents, and (ii) policy-based decentralized coverage control. Agents are classified as anchors and followers, yielding a generic and scalable communication architecture in which each follower interacts with exactly three in-neighbors from the preceding layer, forming an enclosing triangular communication structure. The trained DNN weights implicitly encode the spatial configuration of the agent team, thereby providing a geometric representation of the environmental target set. Within this architecture, we formulate a computationally efficient decentralized Markov decision process (MDP) whose components are time-invariant except for a time-varying cost function, defined by each agent's deviation from the centroid of the target points contained within its communication triangle. Introducing the notion of Anyway Output Controllability (AOC), we assume each agent satisfies this property and establish decentralized convergence to a desired configuration that optimally represents the environmental target set.
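As a minimal illustration (not taken from the paper), the sketch below computes the time-varying stage cost described above: an agent's deviation from the centroid of the target points that fall inside the triangle spanned by its three in-neighbors. The function names `triangle_centroid_cost` and `point_in_triangle`, the 2-D point representation, and the zero-cost convention for empty triangles are our own illustrative assumptions.

```python
import numpy as np

def point_in_triangle(p, a, b, c):
    """Barycentric sign test: True if point p lies inside (or on) triangle (a, b, c)."""
    def sign(p1, p2, p3):
        return (p1[0] - p3[0]) * (p2[1] - p3[1]) - (p2[0] - p3[0]) * (p1[1] - p3[1])
    d1, d2, d3 = sign(p, a, b), sign(p, b, c), sign(p, c, a)
    has_neg = (d1 < 0) or (d2 < 0) or (d3 < 0)
    has_pos = (d1 > 0) or (d2 > 0) or (d3 > 0)
    return not (has_neg and has_pos)

def triangle_centroid_cost(agent_pos, neighbors, targets):
    """Stage cost: distance from agent_pos to the centroid of the target
    points inside the triangle spanned by the three in-neighbor positions.
    Returns 0.0 if no target point lies inside the triangle (an assumed
    convention for this sketch)."""
    a, b, c = neighbors  # positions of the three in-neighbors (preceding layer)
    inside = [t for t in targets if point_in_triangle(t, a, b, c)]
    if not inside:
        return 0.0
    centroid = np.mean(inside, axis=0)
    return float(np.linalg.norm(np.asarray(agent_pos) - centroid))
```

For example, with in-neighbors at (0, 0), (4, 0), and (2, 3) and target points [(1, 1), (3, 1), (2, 2), (10, 10)], the last target lies outside the triangle and is ignored, so the cost for an agent at (2, 0.5) is its distance to the centroid of the first three points. Because only the target points inside each agent's own triangle enter the cost, this component is the sole time-varying part of the MDP, which is consistent with the decentralized structure claimed above.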