Continuous control tasks often involve high-dimensional, dynamic, and non-linear environments. State-of-the-art performance in these tasks is achieved through complex closed-box policies that are effective, but suffer from an inherent opacity. Interpretable policies, while generally underperforming compared to their closed-box counterparts, advantageously facilitate transparent decision-making within automated systems. Hence, their usage is often essential for diagnosing and mitigating errors, supporting ethical and legal accountability, and fostering trust among stakeholders. In this paper, we propose SMOSE, a novel method to train sparsely activated interpretable controllers, based on a top-1 Mixture-of-Experts architecture. SMOSE combines a set of interpretable decision-makers, trained to be experts in different basic skills, and an interpretable router that assigns tasks among the experts. The training is carried out via state-of-the-art Reinforcement Learning algorithms, exploiting load-balancing techniques to ensure fair expert usage. We then distill decision trees from the weights of the router, significantly improving the ease of interpretation. We evaluate SMOSE on six benchmark environments from MuJoCo: our method outperforms recent interpretable baselines and narrows the gap with non-interpretable state-of-the-art algorithms.
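The core mechanism described above can be illustrated with a minimal sketch of top-1 routing: a linear router scores every expert on the current observation, and only the highest-scoring expert's (here, linear) policy computes the action. All names, dimensions, and the use of linear experts are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np

class Top1MoEController:
    """Hypothetical sketch of a top-1 Mixture-of-Experts controller.

    A linear router assigns one score per expert to the observation;
    only the argmax (top-1) expert is activated, so each action is
    produced by a single, individually inspectable linear policy.
    """

    def __init__(self, obs_dim, act_dim, n_experts, seed=0):
        rng = np.random.default_rng(seed)
        # Router weights: one scoring row per expert (illustrative init).
        self.router_w = 0.1 * rng.normal(size=(n_experts, obs_dim))
        # Expert weights: one linear policy (act_dim x obs_dim) per expert.
        self.expert_w = 0.1 * rng.normal(size=(n_experts, act_dim, obs_dim))

    def act(self, obs):
        scores = self.router_w @ obs           # one score per expert
        expert = int(np.argmax(scores))        # top-1 routing: single active expert
        action = self.expert_w[expert] @ obs   # only that expert computes the action
        return expert, action

ctrl = Top1MoEController(obs_dim=4, act_dim=2, n_experts=3)
expert, action = ctrl.act(np.ones(4))
```

Because only one expert fires per step, interpreting a trajectory reduces to reading the router's choices plus one small linear policy at a time; the load-balancing mentioned in the abstract would, during training, discourage the router from collapsing onto a single expert.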