Designing novel inorganic materials with generative models remains an important challenge for materials science, driven by the complexity and diversity of inorganic structures across expansive chemical compositions and structural landscapes. The vast combinatorial space of inorganic compounds demands innovative, AI-driven approaches to overcome limitations in generative accuracy and efficiency. To address this, we introduce a novel method that redefines the encoding and generation of inorganic materials by using a domain-specific, symmetry-aware representation. Our approach not only refines the representation of intricate inorganic structures but also contributes to materials discovery by enhancing the precision and stability of generated candidates. Central to our methodology is a novel padding technique that exploits crystal symmetry information to enhance the encoding process. By integrating Wyckoff position length-aware padding into an encoder architecture, we achieve a more robust, symmetry-informed representation of inorganic materials. This symmetry-driven enhancement improves the ability of deep learning models to generate stable, previously unexplored inorganic structures with higher accuracy and computational efficiency. Furthermore, we introduce an end-to-end system that leverages machine learning potential models to generate stable, novel inorganic materials, including those unseen in the training data, taking the process from initial data to validated output. This pipeline integrates advanced generative models with stability analysis, marking a significant step forward in the automated exploration and design of next-generation inorganic materials. Our method improved reconstruction accuracy by 5.3% on proton conductor data and generated 63.5% more novel, stable inorganic materials than the baseline model on the perov-5 dataset.
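The abstract does not spell out the padding scheme, so the following is only a minimal sketch of one plausible reading of "Wyckoff position length-aware padding": atoms are grouped by Wyckoff orbit, and each orbit is padded to a fixed multiplicity so that symmetry-equivalent sites stay in aligned blocks. The function name `pad_by_wyckoff`, the parameters `max_orbits` and `max_mult`, and the feature layout (atomic number plus fractional coordinates) are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def pad_by_wyckoff(orbits, max_orbits, max_mult, pad_value=0.0):
    """Pad a crystal, given as a list of Wyckoff orbits, to a fixed shape.

    Hypothetical helper (not from the paper).

    orbits: list of (atomic_number, frac_coords), where frac_coords is an
            array-like of shape (m, 3) holding the m symmetry-equivalent sites.
    Returns:
        features: float32 array of shape (max_orbits, max_mult, 4),
                  each row being [atomic_number, x, y, z].
        mask:     bool array of shape (max_orbits, max_mult),
                  True where a real atom is present.
    """
    if len(orbits) > max_orbits:
        raise ValueError("structure has more Wyckoff orbits than max_orbits")

    features = np.full((max_orbits, max_mult, 4), pad_value, dtype=np.float32)
    mask = np.zeros((max_orbits, max_mult), dtype=bool)

    for i, (atomic_number, coords) in enumerate(orbits):
        coords = np.asarray(coords, dtype=np.float32).reshape(-1, 3)
        mult = coords.shape[0]
        if mult > max_mult:
            raise ValueError("orbit multiplicity exceeds max_mult")
        # Fill only the first `mult` slots of this orbit's block;
        # the remaining slots keep pad_value and stay masked out.
        features[i, :mult, 0] = atomic_number
        features[i, :mult, 1:] = coords
        mask[i, :mult] = True

    return features, mask


# Cubic perovskite SrTiO3 (space group Pm-3m): Sr on 1a, Ti on 1b, O on 3c.
srtio3 = [
    (38, [[0.0, 0.0, 0.0]]),
    (22, [[0.5, 0.5, 0.5]]),
    (8, [[0.0, 0.5, 0.5], [0.5, 0.0, 0.5], [0.5, 0.5, 0.0]]),
]
features, mask = pad_by_wyckoff(srtio3, max_orbits=4, max_mult=3)
```

Compared with padding a flat atom list to a maximum atom count, this layout keeps each orbit in its own fixed-size block, and the mask tells an encoder which slots are padding. Whether the paper pads per orbit in exactly this way is not stated in the abstract.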