Learning transparent, interpretable controllers from offline data in decision-making systems is an important area of research because it can reduce the risk of deploying such controllers in real-world systems. In responsibility-sensitive settings such as healthcare, however, decision accountability is of paramount importance, yet it has not been adequately addressed in the literature. This paper introduces the Accountable Offline Controller (AOC), which uses the offline dataset as the Decision Corpus and performs accountable control based on a tailored selection of examples, referred to as the Corpus Subset. AOC operates effectively in low-data scenarios, can be extended to the strictly offline imitation setting, and displays qualities of both conservatism and adaptability. We assess AOC's performance in both simulated and real-world healthcare scenarios, emphasizing its ability to handle offline control tasks with high performance while maintaining accountability.
Keywords: Interpretable Reinforcement Learning, Explainable Reinforcement Learning, Reinforcement Learning Transparency, Offline Reinforcement Learning, Batched Control.