We propose using reinforcement learning to control the dynamical self-assembly of a dodecagonal quasicrystal (DDQC) from patchy particles. Patchy particles interact anisotropically with one another and can form a DDQC. However, the structures they reach at steady state depend strongly on the kinetic pathways of structure formation. We estimate the optimal temperature-control policy using Q-learning and demonstrate that the estimated policy generates DDQCs with few defects. The temperature schedule obtained by reinforcement learning reproduces the desired structure more efficiently than conventional predetermined temperature schedules, such as annealing. To clarify why the learning succeeds, we also analyse a simple model that describes the kinetics of structural change as motion in a triple-well potential. We find that reinforcement learning autonomously discovers the critical temperature at which structural fluctuations enhance the chance of forming a globally stable state. The estimated policy guides the system toward this critical temperature to assist the formation of the DDQC.
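To make the idea of learning a temperature schedule concrete, the following is a minimal sketch of tabular Q-learning applied to a toy triple-well model. It is not the model or the parameters used in this work: the potential (a quartic confinement plus three Gaussian wells), the discrete temperature levels, the per-step reward for occupying the deepest well, and all function names (`force`, `well_index`, `step`, `train`, `greedy_policy`) are assumptions chosen only for illustration.

```python
import math
import random

# All numbers below are illustrative assumptions, not values from the paper.
CENTERS = (-2.0, 0.0, 2.0)          # positions of the three wells
DEPTHS = (1.0, 1.0, 2.0)            # the rightmost well is the deepest (target)
WIDTH = 0.5                         # width of each Gaussian well
TEMPS = (0.1, 0.3, 0.5, 0.7, 0.9)   # discrete temperature levels
ACTIONS = (-1, 0, 1)                # lower, keep, or raise the temperature level
TARGET_WELL = 2


def force(x):
    """Return -dU/dx for U(x) = 0.05 x^4 - sum_i d_i exp(-(x - c_i)^2 / (2 w^2))."""
    f = -0.2 * x ** 3
    for c, d in zip(CENTERS, DEPTHS):
        f -= d * (x - c) / WIDTH ** 2 * math.exp(-((x - c) ** 2) / (2 * WIDTH ** 2))
    return f


def well_index(x):
    """Coarse-grain the coordinate into one of three wells."""
    if x < -1.0:
        return 0
    if x > 1.0:
        return 2
    return 1


def step(x, t_idx, action, rng, dt=0.01, n_sub=50):
    """Apply a temperature action, then integrate overdamped Langevin dynamics."""
    t_idx = min(max(t_idx + ACTIONS[action], 0), len(TEMPS) - 1)
    noise = math.sqrt(2.0 * TEMPS[t_idx] * dt)
    for _ in range(n_sub):
        x += force(x) * dt + noise * rng.gauss(0.0, 1.0)
    reward = 1.0 if well_index(x) == TARGET_WELL else 0.0
    return x, t_idx, reward


def train(episodes=200, steps=20, alpha=0.1, gamma=0.9, eps=0.2, seed=0):
    """Tabular Q-learning over states (well index, temperature level)."""
    rng = random.Random(seed)
    q = {(w, t): [0.0] * len(ACTIONS) for w in range(3) for t in range(len(TEMPS))}
    for _ in range(episodes):
        # Start each episode in a metastable well at a random temperature level.
        x, t_idx = CENTERS[0], rng.randrange(len(TEMPS))
        state = (well_index(x), t_idx)
        for _ in range(steps):
            if rng.random() < eps:
                action = rng.randrange(len(ACTIONS))  # explore
            else:
                action = max(range(len(ACTIONS)), key=lambda a: q[state][a])  # exploit
            x, t_idx, reward = step(x, t_idx, action, rng)
            nxt = (well_index(x), t_idx)
            # Standard Q-learning update.
            q[state][action] += alpha * (reward + gamma * max(q[nxt]) - q[state][action])
            state = nxt
    return q


def greedy_policy(q):
    """Map each state to the temperature change with the largest Q-value."""
    return {s: ACTIONS[max(range(len(ACTIONS)), key=lambda a: v[a])] for s, v in q.items()}
```

Calling `greedy_policy(train())` returns, for each pair of well index and temperature level, whether the learned policy lowers, keeps, or raises the temperature. In this toy setting one would expect, though it is not guaranteed for so short a training run, that the policy heats the system while it is trapped in a metastable well and cools it once it reaches the deepest well.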