A learning automaton (LA) is an adaptive, self-organizing model that improves its action selection through interaction with an unknown environment. LAs with a finite action set fall into two main categories: fixed structure (FSLA) and variable structure (VSLA). The variable action-set learning automaton (VASLA) is one of the main subclasses of VSLA. In this paper, we propose VDHLA, a novel hybrid learning automaton that combines a fixed structure learning automaton with a variable action-set learning automaton. In the proposed model, the VASLA can increase, decrease, or leave unchanged the depth of the FSLA during the action-switching phase. The depth can change in a symmetric (SVDHLA) or an asymmetric (AVDHLA) manner. To the best of our knowledge, this is the first hybrid model that intelligently changes the depth of a fixed structure learning automaton. Several computer simulations study the performance of the proposed model, measured by the total number of rewards and the number of action switches, in stationary and non-stationary environments, and compare it with FSLA and VSLA. To assess the model in a practical application, we consider the selfish mining attack, which threatens the incentive compatibility of proof-of-work blockchains. The proposed model is applied to defend against selfish mining in Bitcoin and is compared with tie-breaking, a well-known defense. In all simulated environments, the proposed model outperforms the methods it is compared with.
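The depth-adjustment idea described in the abstract can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the algorithm of the paper: the class names `DepthSelector` and `VariableDepthAutomaton`, the helper `simulate`, the linear reward-inaction update, the cyclic choice of the next action, the bound `max_depth`, and the rule that reinforces a depth move when the following visit earned more rewards than penalties are all hypothetical choices made here. A single depth shared by all actions is used, which is one possible reading of the symmetric variant.

```python
import random


class DepthSelector:
    """Three-move automaton (grow, shrink, keep) with a reward-inaction update."""

    MOVES = (+1, -1, 0)

    def __init__(self, rate=0.1, rng=None):
        self.p = [1 / 3, 1 / 3, 1 / 3]
        self.rate = rate
        self.rng = rng or random.Random()
        self.last = None      # index of the most recent move
        self.allowed = None   # moves that were available when it was chosen

    def choose(self, allowed):
        # Variable action set: sample only among the currently allowed moves.
        self.allowed = list(allowed)
        weights = [self.p[i] for i in self.allowed]
        self.last = self.rng.choices(self.allowed, weights=weights)[0]
        return self.MOVES[self.last]

    def reinforce(self):
        # Shift probability toward the last move, inside the allowed subset
        # only, so that the whole vector still sums to 1.
        if self.last is None:
            return
        mass = sum(self.p[i] for i in self.allowed)
        for i in self.allowed:
            scaled = self.p[i] / mass
            if i == self.last:
                scaled += self.rate * (1 - scaled)
            else:
                scaled *= 1 - self.rate
            self.p[i] = scaled * mass


class VariableDepthAutomaton:
    """Tsetlin-style automaton whose shared depth is tuned at each action switch."""

    def __init__(self, n_actions, depth=2, max_depth=8, rng=None):
        self.n_actions = n_actions
        self.depth = depth
        self.max_depth = max_depth   # assumed upper bound on the depth
        self.action = 0
        self.level = 1               # 1 = boundary state, depth = deepest state
        self.selector = DepthSelector(rng=rng)
        self.balance = 0             # rewards minus penalties in the current visit

    def update(self, rewarded):
        self.balance += 1 if rewarded else -1
        if rewarded:
            self.level = min(self.level + 1, self.depth)
        elif self.level > 1:
            self.level -= 1
        else:
            self._switch()           # penalty at the boundary state

    def _switch(self):
        # Assumed feedback rule: reinforce the previous depth move if the
        # visit that followed it earned more rewards than penalties.
        if self.balance > 0:
            self.selector.reinforce()
        allowed = [2]                        # keeping the depth is always possible
        if self.depth < self.max_depth:
            allowed.append(0)                # grow
        if self.depth > 1:
            allowed.append(1)                # shrink
        self.depth += self.selector.choose(allowed)
        self.action = (self.action + 1) % self.n_actions
        self.level = 1
        self.balance = 0


def simulate(reward_probs, steps=2000, seed=0):
    """Run the automaton in a stationary environment; count rewards and switches."""
    rng = random.Random(seed)
    la = VariableDepthAutomaton(len(reward_probs), rng=rng)
    rewards = switches = 0
    for _ in range(steps):
        before = la.action
        rewarded = rng.random() < reward_probs[la.action]
        la.update(rewarded)
        rewards += rewarded
        switches += la.action != before
    return la, rewards, switches
```

Calling `simulate([0.2, 0.8])` returns the automaton together with the two quantities the abstract uses as performance measures, the total number of rewards and the number of action switches. An asymmetric variant could be sketched along the same lines by keeping one depth and one selector per action.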