Parallel trajectory optimization via the Alternating Direction Method of Multipliers (ADMM) has emerged as a scalable approach to long-horizon motion planning. However, existing frameworks typically decompose the problem into parallel subproblems based on a predefined fixed structure. Such structural rigidity often causes optimization stagnation in highly constrained regions, where a few lagging subproblems delay global convergence. A natural remedy is to adaptively re-split these stagnating segments online. Yet, deciding when, where, and how to split exceeds the capability of rule-based heuristics. To this end, we propose ATRS, a novel framework that embeds a shared Deep Reinforcement Learning policy into the parallel ADMM loop. We formulate this adaptive adjustment as a Multi-Agent Shared-Policy Markov Decision Process, where all trajectory segments act as homogeneous agents and share a unified neural policy network. This parameter-sharing architecture endows the system with size invariance, enabling it to handle dynamically changing segment counts during re-splitting and generalize to arbitrary trajectory lengths. Furthermore, our formulation inherently supports zero-shot generalization to unseen environments, as our network relies solely on the internal states of the numerical solver rather than on the geometric features of the environment. To ensure solver stability, a Confidence-Based Election mechanism selects only the most stagnating segment for re-splitting at each step. Extensive simulations demonstrate that ATRS accelerates convergence, reducing the number of iterations by up to 26.0% and the computation time by up to 19.1%. Real-world experiments further confirm its applicability to both large-scale offline global planning and real-time onboard replanning within 35 ms per cycle, with no sim-to-real degradation.
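As an illustration of the election step described in the abstract, the following is a minimal sketch, not the ATRS implementation. It applies one shared scoring function to every segment, so the segment count can change freely, and it elects at most one segment for re-splitting per step. The feature layout, the linear-plus-sigmoid stand-in for the neural policy, the weights, the 0.5 threshold, and the midpoint split rule are all assumptions made for this example; the abstract states only that the policy uses internal solver states and that a single, most stagnating segment is selected.

```python
import math
from typing import List, Optional, Sequence, Tuple

# Hypothetical shared parameters: the same weights score every segment,
# which is what makes the scheme independent of the number of segments.
# Assumed per-segment features (solver-internal only):
#   [primal residual, dual residual, iterations without progress]
SHARED_WEIGHTS = (2.0, 1.0, 0.5)
SHARED_BIAS = -3.0


def shared_policy(features: Sequence[float]) -> float:
    """Return a split confidence in (0, 1) for a single segment."""
    z = SHARED_BIAS + sum(w * x for w, x in zip(SHARED_WEIGHTS, features))
    return 1.0 / (1.0 + math.exp(-z))


def elect_segment(
    segment_features: Sequence[Sequence[float]], threshold: float = 0.5
) -> Optional[int]:
    """Pick at most one segment: the one with the highest confidence.

    Returns None when there are no segments or when no confidence
    exceeds the (assumed) threshold, so the structure stays unchanged.
    """
    if not segment_features:
        return None
    confidences = [shared_policy(f) for f in segment_features]
    best = max(range(len(confidences)), key=confidences.__getitem__)
    return best if confidences[best] > threshold else None


def resplit(
    boundaries: List[Tuple[int, int]], index: int
) -> List[Tuple[int, int]]:
    """Split segment `index` at its midpoint (an assumed split rule)."""
    start, end = boundaries[index]
    if end - start < 2:
        # Too short to split; return an unchanged copy.
        return list(boundaries)
    mid = (start + end) // 2
    return boundaries[:index] + [(start, mid), (mid, end)] + boundaries[index + 1:]


if __name__ == "__main__":
    segments = [(0, 10), (10, 20), (20, 30)]
    features = [[0.1, 0.1, 0.0], [1.5, 0.8, 2.0], [0.2, 0.3, 1.0]]
    chosen = elect_segment(features)
    if chosen is not None:
        segments = resplit(segments, chosen)
    print(chosen, segments)
```

Electing a single segment per step keeps each structural change small, which matches the stability motivation given in the abstract. In ATRS the split decision itself is learned, so the fixed midpoint rule here is only a placeholder.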