In this paper, we address the challenge of obtaining a comprehensive and symmetric representation of groups of point particles, such as atoms in a molecule, which is crucial in physics and theoretical chemistry. The problem has become even more important with the widespread adoption of machine-learning techniques in science, because it underpins the capacity of models to accurately reproduce physical relationships while remaining consistent with fundamental symmetries and conservation laws. However, some of the descriptors commonly used to represent point clouds are unable to distinguish between special arrangements of particles in three dimensions. This applies most notably to descriptors based on discretized correlations of the neighbor density, which underpin most existing ML models of matter at the atomic scale. As a result, models built on these descriptors cannot learn the properties of such arrangements. Atom-density correlations are provably complete in the limit in which they simultaneously describe the mutual relationship between all atoms, which is impractical. We present a novel approach to construct descriptors of \emph{finite} correlations based on the relative arrangement of particle triplets. These descriptors can be employed to create symmetry-adapted models with universal approximation capabilities, whose sole convergence parameter is the resolution of the neighbor discretization. We demonstrate our strategy on a class of atomic arrangements specifically built to defy a broad class of conventional symmetric descriptors, showcasing its potential for addressing their limitations.