We study whether optimal state-feedback laws for a family of heterogeneous Multiple-Input, Multiple-Output (MIMO) Linear Time-Invariant (LTI) systems can be captured by a single learned controller. We train one transformer policy on Linear Quadratic Regulator (LQR)-generated trajectories from systems with different state and input dimensions, using a shared representation built from standardization, padding, dimension encoding, and a masked loss. The policy maps recent state history to control actions without requiring the plant matrices at inference time. Across a broad set of systems, it achieves small empirical sub-optimality relative to LQR, remains stabilizing under moderate parameter perturbations, and benefits from lightweight fine-tuning on unseen systems. These results support transformer policies as practical approximators of near-optimal feedback laws over structured families of linear systems.
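To make the shared-representation idea concrete, the following is a minimal sketch (not the authors' implementation) of how heterogeneous systems could share one transformer policy: states and actions are zero-padded to maximum dimensions, the true dimensions (n, m) are appended as a simple dimension encoding, and the training loss is masked so padded input channels do not contribute. All names, sizes, and the dimension-encoding scheme here are illustrative assumptions.

```python
# Minimal sketch, assuming a PyTorch setup; hypothetical names and sizes,
# standing in for the abstract's standardization + padding + dimension
# encoding + masked-loss recipe over LQR-generated trajectories.
import torch
import torch.nn as nn

MAX_N, MAX_M, HIST = 8, 4, 16  # max state dim, max input dim, history length

class PaddedPolicy(nn.Module):
    def __init__(self, d_model=64, nhead=4, nlayers=2):
        super().__init__()
        # project padded state plus a crude (n, m) dimension encoding
        self.embed = nn.Linear(MAX_N + 2, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, nlayers)
        self.head = nn.Linear(d_model, MAX_M)  # padded control output

    def forward(self, states, dims):
        # states: (B, HIST, MAX_N) standardized, zero-padded state history
        # dims:   (B, 2) true (n, m) of each system, broadcast over time
        dim_enc = dims.unsqueeze(1).expand(-1, states.size(1), -1) / MAX_N
        h = self.embed(torch.cat([states, dim_enc], dim=-1))
        h = self.encoder(h)
        return self.head(h[:, -1])  # action read off the most recent token

def masked_mse(pred_u, target_u, m_dims):
    # mask out padded input channels beyond each system's true input dim m
    mask = (torch.arange(MAX_M)[None, :] < m_dims[:, None]).float()
    return ((pred_u - target_u) ** 2 * mask).sum() / mask.sum()

# toy usage with random data standing in for LQR-generated trajectories
policy = PaddedPolicy()
states = torch.randn(32, HIST, MAX_N)
dims = torch.tensor([[6, 3]] * 32, dtype=torch.float32)  # n=6, m=3 systems
target = torch.randn(32, MAX_M)
loss = masked_mse(policy(states, dims), target, dims[:, 1].long())
loss.backward()
```

At inference time, only the padded state history and the dimension encoding are needed, consistent with the abstract's claim that the plant matrices are not required.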