Cross-Platform Learnable Fuzzy Gain-Scheduled Proportional-Integral-Derivative Controller Tuning via Physics-Constrained Meta-Learning and Reinforcement Learning Adaptation


JiaHao Wu, ShengWen Yu

From arXiv; 24 pages, 15 tables, 6 figures

Motivation and gap: PID-family controllers remain a pragmatic choice for many robotic systems due to their simplicity and interpretability, but tuning stable, high-performing gains is time-consuming, and the resulting gains typically do not transfer across robot morphologies, payloads, and deployment conditions. Fuzzy gain scheduling can provide interpretable online adjustment, yet its per-joint scaling and consequent parameters are platform-dependent and difficult to tune systematically.

Proposed approach: We propose a hierarchical framework for cross-platform tuning of a learnable fuzzy gain-scheduled PID (LF-PID). The controller uses shared fuzzy membership partitions to preserve common error semantics, while learning per-joint scaling and Takagi-Sugeno consequent parameters that schedule PID gains online. Combined with physics-constrained virtual robot synthesis, meta-learning provides cross-platform initialization from robot physical features, and a lightweight reinforcement learning (RL) stage performs deployment-specific refinement under dynamics mismatch. Starting from three base simulated platforms, we generate 232 physically valid training variants via bounded perturbations of mass (±10%), inertia (±15%), and friction (±20%).

Results and insight: We evaluate cross-platform generalization on two distinct systems (a 9-DOF serial manipulator and a 12-DOF quadruped) under multiple disturbance scenarios. The RL adaptation stage improves tracking performance on top of the meta-initialized controller, with up to 80.4% error reduction in challenging high-load joints (from 12.36° to 2.42°) and 19.2% improvement under parameter uncertainty. We further identify an optimization ceiling effect: online refinement yields substantial gains when the meta-initialized baseline exhibits localized deficiencies, but provides limited improvement when baseline quality is already uniformly strong.
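To make the gain-scheduling idea concrete, here is a minimal Python sketch of a fuzzy gain-scheduled PID for a single joint. It is an illustration under stated assumptions, not the authors' implementation. It assumes three triangular membership sets on a normalized error axis, zero-order Takagi-Sugeno consequents (one constant `(Kp, Ki, Kd)` triple per set), and scheduling on the tracking error alone. The names `CENTERS`, `memberships`, `scheduled_gains`, and `FuzzyPID`, and all numeric values, are hypothetical.

```python
import numpy as np

# Shared membership partition (assumption): triangular sets centred at
# Negative, Zero, Positive on a normalised error axis.
CENTERS = np.array([-1.0, 0.0, 1.0])


def memberships(x):
    """Triangular membership degree of x in each shared fuzzy set."""
    x = np.clip(x, CENTERS[0], CENTERS[-1])  # saturate outside the partition
    width = CENTERS[1] - CENTERS[0]
    return np.maximum(0.0, 1.0 - np.abs(x - CENTERS) / width)


def scheduled_gains(error, scale, consequents):
    """Zero-order Takagi-Sugeno inference: firing-strength-weighted gains.

    scale       -- per-joint input scaling (a learnable parameter)
    consequents -- shape (3, 3): one (Kp, Ki, Kd) row per fuzzy set
    """
    w = memberships(scale * error)
    return w @ consequents / w.sum()


class FuzzyPID:
    """Single-joint PID whose gains are rescheduled at every step."""

    def __init__(self, scale, consequents, dt):
        self.scale = scale
        self.consequents = np.asarray(consequents, dtype=float)
        self.dt = dt
        self.integral = 0.0
        self.prev_error = None

    def step(self, error):
        kp, ki, kd = scheduled_gains(error, self.scale, self.consequents)
        self.integral += error * self.dt
        # No derivative term on the first call, to avoid a start-up kick.
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return kp * error + ki * self.integral + kd * derivative


# Hypothetical consequents: stiffer gains for large errors, softer near zero.
EXAMPLE_CONSEQUENTS = np.array([
    [8.0, 0.5, 0.20],   # error is Negative
    [2.0, 0.1, 0.05],   # error is Zero
    [8.0, 0.5, 0.20],   # error is Positive
])
```

With these example consequents and `scale = 1.0`, an error of 0.5 fires the Zero and Positive sets equally, so the scheduled gains are the midpoint of their rows, approximately `(5.0, 0.3, 0.125)`. In the framework described above, `scale` and the consequent rows are the per-joint quantities that meta-learning initializes and RL refines. The shared partition `CENTERS` stays fixed across platforms.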
