Flow-Opt: Scalable Centralized Multi-Robot Trajectory Optimization with Flow Matching and Differentiable Optimization

Centralized trajectory optimization in the joint space of multiple robots allows access to a larger feasible space, which can result in smoother trajectories, especially when planning in tight spaces. Unfortunately, it is often computationally intractable beyond very small swarm sizes. In this paper, we propose Flow-Opt, a learning-based approach to improving the computational tractability of centralized multi-robot trajectory optimization. Specifically, we reduce the problem to first learning a generative model that samples different candidate trajectories and then using a learned Safety-Filter (SF) to ensure fast inference-time constraint satisfaction. As the generative model, we propose a flow-matching model with a diffusion transformer (DiT) augmented with permutation-invariant robot position and map encoders. We develop a custom solver for our SF and equip it with a neural network that predicts context-specific initialization. The initialization network is trained in a self-supervised manner, taking advantage of the differentiability of the SF solver. We advance the state of the art in the following respects. First, we show that we can generate trajectories for tens of robots in cluttered environments in a few tens of milliseconds. This is several times faster than existing centralized optimization approaches. Moreover, our approach generates smoother trajectories orders of magnitude faster than competing baselines based on diffusion models. Second, each component of our approach can be batched, allowing us to solve a few tens of problem instances in a fraction of a second. We believe this is the first such result; no existing approach provides such capabilities. Finally, our approach can generate a diverse set of trajectories between a given set of start and goal locations, capturing different collision-avoidance behaviors.
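
The two-stage pipeline described above (sample candidate trajectories with a flow-matching model, then pass them through a safety filter) can be illustrated with a small NumPy sketch. Everything in it is an assumption made for illustration and is not the Flow-Opt implementation: `toy_velocity` is a closed-form stand-in for the learned DiT velocity field, `pairwise_safety_filter` is a naive iterative push-apart heuristic rather than the custom differentiable SF solver, and the names `sample_flow`, `d_min`, and the array layout `(batch, robots, steps, 2)` are hypothetical. The leading batch axis is there only to show how both stages can operate on several problem instances at once.

```python
import numpy as np


def sample_flow(velocity_fn, x0, num_steps=10):
    """Euler-integrate dx/dt = velocity_fn(x, t) from t = 0 (noise) to t = 1."""
    x = x0.copy()
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt  # t stays strictly below 1 inside the loop
        x = x + dt * velocity_fn(x, t)
    return x


def toy_velocity(target):
    """Hypothetical stand-in for a learned network: straight-line flow to `target`."""
    def v(x, t):
        # Velocity of the linear path x_t = (1 - t) * x_0 + t * target.
        return (target - x) / (1.0 - t)
    return v


def pairwise_safety_filter(traj, d_min, num_iters=50):
    """Naive filter (assumption, not the paper's solver).

    traj has shape (batch, robots, steps, 2). At every time step, any two
    robots closer than d_min are pushed apart symmetrically. Obstacles,
    smoothness, and start/goal constraints are ignored in this toy version.
    """
    traj = traj.copy()
    num_robots = traj.shape[1]
    for _ in range(num_iters):
        for i in range(num_robots):
            for j in range(i + 1, num_robots):
                diff = traj[:, i] - traj[:, j]                      # (batch, steps, 2)
                dist = np.linalg.norm(diff, axis=-1, keepdims=True)
                direction = diff / np.maximum(dist, 1e-9)           # avoid divide-by-zero
                push = 0.5 * np.maximum(d_min - dist, 0.0) * direction
                traj[:, i] += push
                traj[:, j] -= push
    return traj
```

In use, one would draw a batch of Gaussian noise tensors, call `sample_flow` with the velocity model, and then pass the result to the filter. With the toy velocity field every sample collapses onto the same target, so the diversity reported in the abstract would only appear with a learned, context-conditioned model.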
