We propose using reinforcement learning to control the dynamical self-assembly of a dodecagonal quasicrystal (DDQC) from patchy particles. Patchy particles interact anisotropically with one another and can form a DDQC. However, the structure reached at steady state depends strongly on the kinetic pathway of structure formation. We estimate the best temperature-control policy by Q-learning and demonstrate that the estimated policy generates a DDQC with few defects. The temperature schedule obtained by reinforcement learning reproduces the desired structure more efficiently than conventional predetermined temperature schedules, such as annealing. To clarify why the learning succeeds, we also analyse a simple model that describes the kinetics of structural change as motion in a triple-well potential. We find that reinforcement learning autonomously discovers the critical temperature at which structural fluctuations enhance the chance of forming the globally stable state. The estimated policy guides the system toward this critical temperature to assist the formation of the DDQC.
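To make the idea of a learned temperature schedule concrete, the following is a minimal sketch of tabular Q-learning on a toy problem. It is not the model or the state definition used in this work: the discrete temperature grid `N_TEMPS`, the level `TARGET` standing in for the critical temperature, the reward in `step`, and all hyperparameters are assumptions chosen only for illustration. The agent may lower, hold, or raise the temperature by one level and is rewarded for staying close to `TARGET`.

```python
import random

ACTIONS = (-1, 0, +1)  # lower, hold, or raise the temperature by one level
N_TEMPS = 5            # number of discrete temperature levels (toy assumption)
TARGET = 2             # hypothetical level standing in for the critical temperature


def step(state, action):
    """Toy environment: move on the temperature grid; reward closeness to TARGET."""
    next_state = min(max(state + action, 0), N_TEMPS - 1)
    reward = -abs(next_state - TARGET)
    return next_state, reward


def train(episodes=500, steps=20, alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0] * len(ACTIONS) for _ in range(N_TEMPS)]
    for _ in range(episodes):
        state = rng.randrange(N_TEMPS)  # random initial temperature level
        for _ in range(steps):
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))  # explore
            else:
                a = max(range(len(ACTIONS)), key=lambda i: q[state][i])  # exploit
            next_state, reward = step(state, ACTIONS[a])
            # Q-learning update: bootstrap from the best action in the next state.
            td_target = reward + gamma * max(q[next_state])
            q[state][a] += alpha * (td_target - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """Return the greedy temperature change for every temperature level."""
    return [ACTIONS[max(range(len(ACTIONS)), key=lambda i: q[s][i])]
            for s in range(N_TEMPS)]
```

Calling `greedy_policy(train())` yields one temperature change per level. In this toy setting the learned policy is expected to steer every level toward `TARGET`, loosely mirroring how the estimated policy guides the system toward the critical temperature.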