Inverse materials design has proven successful in accelerating novel material discovery. Many inverse materials design methods use unsupervised learning, in which a latent space is learned to provide a compact description of materials representations. A latent space learned this way is likely to be entangled with respect to the target property and the other properties of the materials, which makes the inverse design process ambiguous. Here, we present a semi-supervised learning approach based on a disentangled variational autoencoder that learns a probabilistic relationship between features, latent variables, and target properties. The approach is data efficient because it combines all labelled and unlabelled data in a coherent manner, and it uses expert-informed prior distributions to improve model robustness even with limited labelled data. It is inherently interpretable, because the learnable target property is disentangled from the other properties of the materials, and an extra layer of interpretability can be provided by a post-hoc analysis of the model's classification head. We demonstrate the approach on an experimental high-entropy alloy dataset with chemical compositions as input and single-phase formation as the single target property. High-entropy alloys were chosen as example materials because of the vast chemical space spanned by their possible combinations of compositions and atomic configurations. While a single property is used in this work, the disentangled model can be extended to the inverse design of materials with multiple target properties.
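To make the idea of combining labelled and unlabelled data in one objective concrete, the sketch below writes out the per-example loss terms of a generic semi-supervised variational autoencoder. It is an illustration under stated assumptions, not the model used in this work: the abstract does not give the objective, so the code uses the widely used formulation in which a binary label y (here standing for single-phase formation) is treated as an extra latent variable for unlabelled examples. The function names, the unit-variance Gaussian likelihood, the standard-normal prior on z, and the scalar `prior_y` standing in for an expert-informed prior on the label are all hypothetical choices made for this example. Encoder and decoder outputs are passed in as plain lists, so no neural network is defined.

```python
import math


def gaussian_kl(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over latent dimensions."""
    return 0.5 * sum(m * m + math.exp(lv) - lv - 1.0 for m, lv in zip(mu, log_var))


def gaussian_log_lik(x, x_hat, sigma=1.0):
    """log N(x | x_hat, sigma^2 I), summed over feature dimensions (assumed likelihood)."""
    sq_err = sum((a - b) ** 2 for a, b in zip(x, x_hat))
    return -0.5 * sq_err / sigma ** 2 - 0.5 * len(x) * math.log(2.0 * math.pi * sigma ** 2)


def labelled_bound(x, x_hat, mu, log_var, y, prior_y):
    """Negative ELBO for a labelled pair (x, y): -log p(x|y,z) - log p(y) + KL(q(z|x,y) || p(z))."""
    log_prior_y = math.log(prior_y if y == 1 else 1.0 - prior_y)
    return -gaussian_log_lik(x, x_hat) - log_prior_y + gaussian_kl(mu, log_var)


def unlabelled_bound(x, recon_by_y, post_by_y, q_y, prior_y):
    """Negative ELBO for an unlabelled x.

    The unknown label is marginalised under the classifier output q(y=1|x) = q_y:
    sum_y q(y|x) * (labelled_bound(x, y) + log q(y|x)).
    """
    total = 0.0
    for y in (0, 1):
        q = q_y if y == 1 else 1.0 - q_y
        if q > 0.0:  # a zero-probability label contributes nothing
            mu, log_var = post_by_y[y]
            total += q * (labelled_bound(x, recon_by_y[y], mu, log_var, y, prior_y) + math.log(q))
    return total


def classification_loss(q_y, y):
    """Cross-entropy of the classification head on a labelled example."""
    return -math.log(q_y if y == 1 else 1.0 - q_y)
```

In a training loop, the total loss would typically be the sum of `labelled_bound` and a weighted `classification_loss` over labelled examples plus `unlabelled_bound` over unlabelled ones; the weighting is a tunable assumption rather than something stated in the abstract.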