We present a coverage framework that integrates Hilbert space-filling priors into decentralized multi-robot learning and execution. We augment DQN and PPO with Hilbert-based spatial indices to structure exploration and reduce redundancy in sparse-reward environments, and we evaluate scalability on multi-robot grid coverage. We further describe a waypoint interface that converts Hilbert orderings into curvature-bounded, time-parameterized SE(2) trajectories (planar poses (x, y, θ)) that remain feasible to execute onboard resource-constrained robots. Experiments show higher coverage efficiency, lower redundancy, and faster convergence than the DQN/PPO baselines. In addition, we validate the approach on a Boston Dynamics Spot legged robot, executing the generated trajectories in indoor environments and observing reliable coverage with low redundancy. These results indicate that geometric priors improve autonomy and scalability for swarm and legged robotics.
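To make the Hilbert ordering and the waypoint interface concrete, the following is a minimal sketch and not the authors' implementation. It uses the standard index-to-cell mapping for a Hilbert curve on a 2^order × 2^order grid, then attaches a heading and a timestamp to each cell center. The function names `hilbert_d2xy` and `hilbert_waypoints`, and the parameters `cell_size` and `speed`, are assumptions introduced here for illustration. The sketch does not enforce a curvature bound: headings change in 90° steps, so a real interface would still need to smooth the corners.

```python
import math


def hilbert_d2xy(order, d):
    """Map a Hilbert index d in [0, 4**order) to a cell (x, y) on a 2**order grid."""
    n = 1 << order
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        # Rotate/flip the sub-quadrant so consecutive indices stay adjacent.
        if ry == 0:
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def hilbert_waypoints(order, cell_size=1.0, speed=0.5):
    """Return a list of (x, y, theta, t) poses visiting cell centers in Hilbert order.

    Assumptions (hypothetical, not from the paper): constant speed, straight
    segments between adjacent cell centers, heading pointing at the next cell.
    """
    cells = [hilbert_d2xy(order, d) for d in range(4 ** order)]
    waypoints = []
    theta = 0.0
    for i, (cx, cy) in enumerate(cells):
        if i + 1 < len(cells):
            nx, ny = cells[i + 1]
            theta = math.atan2(ny - cy, nx - cx)
        # The last waypoint keeps the heading of the previous segment.
        px = (cx + 0.5) * cell_size
        py = (cy + 0.5) * cell_size
        # Consecutive Hilbert cells are adjacent, so each segment has length cell_size.
        t = i * cell_size / speed
        waypoints.append((px, py, theta, t))
    return waypoints
```

Because the ordering is one-dimensional, one simple way to split work among robots (again an assumption, not a scheme described in the abstract) is to give each robot a contiguous slice of the index range, which keeps each robot's cells spatially clustered and non-overlapping.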