We study adversarially robust multitask adaptive linear quadratic control, a setting in which multiple systems collaboratively learn control policies under model uncertainty and adversarial corruption. We propose a clustered multitask approach that combines clustering and system identification with resilient aggregation to mitigate corrupted model updates. Our analysis characterizes how clustering accuracy, intra-cluster heterogeneity, and adversarial behavior affect the expected regret of certainty-equivalent (CE) control across LQR tasks. We establish non-asymptotic bounds showing that the regret decreases inversely with the number of honest systems per cluster, and that this reduction is preserved when the fraction of adversarial systems within each cluster is bounded.
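To make the resilient aggregation step concrete, the following is a minimal sketch of one common robust rule, a coordinate-wise trimmed mean over the model estimates reported by the systems in a cluster. The abstract does not state which aggregation rule is used, so the choice of trimmed mean, the function names `trimmed_mean` and `aggregate_estimates`, the parameter `num_trim`, and the toy numbers are all assumptions made purely for illustration.

```python
from statistics import mean


def trimmed_mean(values, num_trim):
    """Drop the num_trim smallest and num_trim largest values, average the rest."""
    if 2 * num_trim >= len(values):
        raise ValueError("num_trim is too large for the number of values")
    kept = sorted(values)[num_trim:len(values) - num_trim]
    return mean(kept)


def aggregate_estimates(estimates, num_trim):
    """Coordinate-wise trimmed mean of equally shaped matrices (lists of rows).

    Assumption: each entry of `estimates` is one system's estimate of the same
    model matrix, and at most `num_trim` of them are corrupted.
    """
    rows, cols = len(estimates[0]), len(estimates[0][0])
    return [
        [trimmed_mean([m[i][j] for m in estimates], num_trim) for j in range(cols)]
        for i in range(rows)
    ]


# Hypothetical cluster: four honest estimates of a 1x2 matrix, one corrupted.
honest = [[[1.0, 0.5]], [[1.25, 0.25]], [[0.75, 0.75]], [[1.0, 0.5]]]
corrupted = [[[100.0, -100.0]]]

robust = aggregate_estimates(honest + corrupted, num_trim=1)
naive = aggregate_estimates(honest + corrupted, num_trim=0)  # plain average
```

In this toy cluster the plain average is pulled far from the honest estimates by the single corrupted report, while the trimmed mean stays close to them. This is the qualitative behavior a resilient aggregator is meant to provide when the fraction of adversarial systems is bounded.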