This paper introduces ACS2HER, a novel integration of the Anticipatory Classifier System (ACS2) with the Hindsight Experience Replay (HER) mechanism. While ACS2 is highly effective at building cognitive maps through latent learning, its performance often stagnates in environments characterized by sparse rewards. We propose a specific architectural variant that triggers hindsight learning when the agent fails to reach its primary goal, re-labeling visited states as virtual goals to densify the learning signal. The proposed model was evaluated on two benchmarks: the deterministic \texttt{Maze 6} and the stochastic \texttt{FrozenLake}. The results demonstrate that ACS2HER significantly accelerates knowledge acquisition and environmental mastery compared to the standard ACS2. However, this efficiency gain is accompanied by increased computational overhead and a substantial expansion in classifier numerosity. This work provides the first analysis of combining anticipatory mechanisms with retrospective goal-relabeling in Learning Classifier Systems.
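The failure-triggered relabeling described above can be illustrated with a minimal sketch. It is not the ACS2HER implementation: the names `Transition`, `sparse_reward`, and `hindsight_relabel` are hypothetical, the reward is assumed to be 1 only when the goal is reached, and each distinct visited state is assumed to serve as a virtual goal once. How the relabeled transitions feed the ACS2 classifier updates is not specified here and is left out.

```python
from dataclasses import dataclass
from typing import Hashable, List

State = Hashable


@dataclass(frozen=True)
class Transition:
    """One goal-conditioned step (hypothetical container for this sketch)."""
    state: State
    action: int
    next_state: State
    goal: State
    reward: float


def sparse_reward(next_state: State, goal: State) -> float:
    # Assumed sparse signal: reward only when the goal state is reached.
    return 1.0 if next_state == goal else 0.0


def hindsight_relabel(episode: List[Transition]) -> List[Transition]:
    """Build virtual transitions for a failed episode.

    Returns an empty list when the episode is empty or when the primary
    goal was reached, so hindsight is triggered only on failure.
    """
    if not episode or any(t.next_state == t.goal for t in episode):
        return []

    virtual: List[Transition] = []
    seen_goals = set()
    for end, reached in enumerate(episode):
        virtual_goal = reached.next_state
        if virtual_goal in seen_goals:
            continue  # use each visited state as a virtual goal only once
        seen_goals.add(virtual_goal)
        # Replay the prefix that led to the virtual goal, with rewards
        # recomputed as if that state had been the goal all along.
        for t in episode[: end + 1]:
            virtual.append(
                Transition(
                    t.state,
                    t.action,
                    t.next_state,
                    virtual_goal,
                    sparse_reward(t.next_state, virtual_goal),
                )
            )
    return virtual
```

For a failed two-step episode visiting states 0, 1, 2 with primary goal 9, the sketch produces three virtual transitions: one for virtual goal 1 and two for virtual goal 2. Each prefix ends with a rewarded step, which is the sense in which relabeling densifies the learning signal.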