There are several techniques for representing the chess board inside the computer. In the first part of this paper, we explain the concept of the bitboard representation and the advantages of (rotated) bitboards in move generation. To illustrate these ideas in practice, we discuss the concrete implementation of the move generator in FUSc# and explain how to verify it with the "perft" command. We show that the move generator of FUSc# works 100% correctly. The second part of this paper deals with reinforcement learning in computer chess (and beyond). We illustrate the progress made in this field over the last 15-20 years by comparing the state of the art from 2002-2008, when FUSc# was developed, with recent innovations connected to "AlphaZero". We discuss how a "FUSc#-Zero" could be implemented and what would be needed to reduce the number of training games required to achieve good performance. This can be seen as a test case for the general problem of improving "sample efficiency" in reinforcement learning. In the final part, we move beyond computer chess, as the importance of sample efficiency extends far beyond board games into a wide range of applications where data is costly, difficult to obtain, or time-consuming to generate. We review some applications of the ideas developed in AlphaZero in other domains, namely the "other Alphas" such as AlphaFold, AlphaTensor, AlphaGeometry and AlphaProof. We also discuss future research and the potential of such methods for ecological economic planning.
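To make the bitboard idea concrete, here is a minimal Python sketch of set-wise knight move generation on 64-bit bitboards. It is an illustration only and not the FUSc# implementation: the square mapping (a1 = bit 0, h8 = bit 63) and the names `knight_attacks` and `count_knight_moves` are assumptions made for this example.

```python
FULL = 0xFFFFFFFFFFFFFFFF          # mask to keep results within 64 bits
FILE_A = 0x0101010101010101        # one bit per rank on the a-file
FILE_B = FILE_A << 1
FILE_G = FILE_A << 6
FILE_H = FILE_A << 7

def knight_attacks(knights: int) -> int:
    """Return all squares attacked by the knights in the given bitboard.

    Assumed mapping: bit index = rank * 8 + file, so a1 = 0 and h8 = 63.
    File masks stop shifts from wrapping around the board edge.
    """
    not_a = FULL & ~FILE_A
    not_ab = FULL & ~(FILE_A | FILE_B)
    not_h = FULL & ~FILE_H
    not_gh = FULL & ~(FILE_G | FILE_H)
    attacks = 0
    attacks |= (knights & not_h) << 17   # two ranks up, one file right
    attacks |= (knights & not_a) << 15   # two ranks up, one file left
    attacks |= (knights & not_gh) << 10  # one rank up, two files right
    attacks |= (knights & not_ab) << 6   # one rank up, two files left
    attacks |= (knights & not_h) >> 15   # two ranks down, one file right
    attacks |= (knights & not_a) >> 17   # two ranks down, one file left
    attacks |= (knights & not_gh) >> 6   # one rank down, two files right
    attacks |= (knights & not_ab) >> 10  # one rank down, two files left
    return attacks & FULL

def count_knight_moves(knights: int, own_pieces: int) -> int:
    """Count pseudo-legal knight moves, one knight at a time."""
    total = 0
    while knights:
        lsb = knights & -knights                    # isolate lowest set bit
        targets = knight_attacks(lsb) & ~own_pieces & FULL
        total += bin(targets).count("1")            # population count
        knights &= knights - 1                      # clear lowest set bit
    return total

# White in the initial position: pieces on ranks 1 and 2, knights on b1 and g1.
WHITE_START = 0xFFFF
WHITE_KNIGHTS = (1 << 1) | (1 << 6)
print(count_knight_moves(WHITE_KNIGHTS, WHITE_START))
```

In the initial position this counts the four knight moves (to a3, c3, f3 and h3) that are part of White's 20 legal first moves. A perft check applies the same counting idea to the full move generator, recursively down to a fixed depth, and compares the totals with known reference values.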