Spatiotemporal chaos in fluid systems exhibits severe parametric sensitivity, rendering classical adjoint-based optimal control intractable because the control law must be recomputed for each operating regime. We address this bottleneck with hyperFastRL, a parameter-conditioned reinforcement learning framework that uses Hypernetworks to shift from tuning an isolated controller per regime to learning a unified parametric control manifold. By mapping a physical forcing parameter μ directly to the weights of a spatial feedback policy, the architecture decouples parametric adaptation from spatial boundary stabilization. To overcome the extreme variance inherent to chaotic reward landscapes, we deploy pessimistic distributional value estimation over a massively parallel environment ensemble. We evaluate three Hypernetwork functional forms (residual MLPs, periodic Fourier representations, and Kolmogorov-Arnold networks (KANs)) on the Kuramoto-Sivashinsky equation under varying spatial forcing. All three forms achieve robust stabilization. KAN yields the most consistent energy-cascade suppression and tracking across unseen parametrizations, while Fourier networks exhibit higher variability under extrapolation. Furthermore, high-throughput parallelization allows us to intentionally trade a fraction of peak asymptotic reward for a 37% reduction in training wall-clock time, identifying an optimal operating regime for practical deployment in complex, parameter-varying chaotic PDEs.
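To make the central idea concrete, the following is a minimal sketch of a hypernetwork that maps a scalar forcing parameter μ to the weights of a feedback policy. It is illustrative only and is not the hyperFastRL implementation: the single tanh hidden layer, the linear policy head, the layer sizes, and the names `init_hypernet`, `generate_policy_weights`, and `policy` are all assumptions introduced here, and the residual MLP, Fourier, and KAN forms evaluated in the paper are not reproduced.

```python
import numpy as np


def init_hypernet(rng, hidden, obs_dim, act_dim):
    """Random parameters for a one-hidden-layer hypernetwork (hypothetical sizes)."""
    n_target = obs_dim * act_dim + act_dim  # policy weight matrix plus bias
    return {
        "W1": rng.normal(0.0, 1.0, (1, hidden)),
        "b1": np.zeros(hidden),
        "W2": rng.normal(0.0, 0.1, (hidden, n_target)),
        "b2": np.zeros(n_target),
    }


def generate_policy_weights(params, mu, obs_dim, act_dim):
    """Map the scalar parameter mu to the weights (W, b) of a linear policy."""
    h = np.tanh(np.array([[mu]]) @ params["W1"] + params["b1"])  # shape (1, hidden)
    flat = (h @ params["W2"] + params["b2"]).ravel()             # shape (n_target,)
    W = flat[: obs_dim * act_dim].reshape(obs_dim, act_dim)
    b = flat[obs_dim * act_dim:]
    return W, b


def policy(params, mu, obs, obs_dim, act_dim):
    """Feedback policy whose weights are produced by the hypernetwork for this mu."""
    W, b = generate_policy_weights(params, mu, obs_dim, act_dim)
    return np.tanh(obs @ W + b)  # actions bounded in [-1, 1]


# Usage: one hypernetwork serves several forcing parameters.
rng = np.random.default_rng(0)
obs_dim, act_dim = 8, 4  # assumed sensor and actuator counts
params = init_hypernet(rng, hidden=16, obs_dim=obs_dim, act_dim=act_dim)
obs = rng.normal(size=obs_dim)
for mu in (0.0, 0.1, 0.2):
    action = policy(params, mu, obs, obs_dim, act_dim)
    print(mu, action.shape)
```

In this sketch only the hypernetwork parameters would be trained; the policy weights are regenerated for each μ, which is what allows a single model to cover a family of regimes rather than one controller per regime.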