Designing novel inorganic materials with generative models remains an important challenge in materials science, owing to the complexity and diversity of inorganic structures across expansive chemical compositions and structural landscapes. The vast combinatorial space of inorganic compounds demands AI-driven approaches that overcome current limitations in generative accuracy and efficiency. To address this, we introduce a method that redefines the encoding and generation of inorganic materials using a domain-specific, symmetry-aware representation. Our approach refines the representation of intricate inorganic structures and improves the precision and stability of generated candidates. Central to our methodology is a novel padding technique that exploits crystal symmetry information to enhance the encoding process. By integrating Wyckoff position length-aware padding into an encoder architecture, we obtain a more robust and better-informed representation of inorganic materials. This symmetry-driven enhancement enables deep learning models to generate stable, previously unexplored inorganic structures with higher accuracy and computational efficiency. Furthermore, we introduce an end-to-end system that leverages machine learning potential models to generate stable, novel inorganic materials, including ones unseen in the training data, from initial data to validated output. This pipeline integrates generative models with stability analysis, advancing the automated exploration and design of next-generation inorganic materials. Our method improved reconstruction accuracy by 5.3% on proton conductor data and generated 63.5% more novel, stable inorganic materials than the baseline model on the perov-5 dataset.
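The abstract does not spell out how Wyckoff position length-aware padding works, so the following is only a minimal sketch of one plausible reading: atoms are grouped by Wyckoff site, and each site's block of coordinates is padded to a fixed length instead of padding one flat atom list. The function name `pad_by_wyckoff`, the `max_sites` and `max_mult` parameters, the zero padding value, and the mask layout are all assumptions made for illustration, not the authors' implementation.

```python
from typing import List, Tuple

Coord = Tuple[float, float, float]
PAD: Coord = (0.0, 0.0, 0.0)  # assumed padding value (hypothetical choice)


def pad_by_wyckoff(
    sites: List[List[Coord]], max_sites: int, max_mult: int
) -> Tuple[List[List[Coord]], List[List[int]]]:
    """Pad each Wyckoff site to max_mult slots and the crystal to max_sites sites.

    sites[i] holds the fractional coordinates of all symmetry-equivalent atoms
    on Wyckoff site i. Returns (padded, mask), both of shape
    [max_sites][max_mult]; mask is 1 for real atoms and 0 for padding.
    """
    if len(sites) > max_sites:
        raise ValueError("more Wyckoff sites than max_sites")
    padded: List[List[Coord]] = []
    mask: List[List[int]] = []
    for i in range(max_sites):
        orbit = sites[i] if i < len(sites) else []
        if len(orbit) > max_mult:
            raise ValueError("site multiplicity exceeds max_mult")
        n_pad = max_mult - len(orbit)
        padded.append(list(orbit) + [PAD] * n_pad)
        mask.append([1] * len(orbit) + [0] * n_pad)
    return padded, mask


# Ideal cubic perovskite ABX3 (space group Pm-3m): A on 1a, B on 1b, X on 3c.
sites = [
    [(0.0, 0.0, 0.0)],                                        # A, multiplicity 1
    [(0.5, 0.5, 0.5)],                                        # B, multiplicity 1
    [(0.0, 0.5, 0.5), (0.5, 0.0, 0.5), (0.5, 0.5, 0.0)],      # X, multiplicity 3
]
padded, mask = pad_by_wyckoff(sites, max_sites=4, max_mult=3)
```

Under this assumed layout, symmetry-equivalent atoms always occupy the same block of slots, and the mask lets an encoder ignore padded entries; whether the method in the paper uses this exact arrangement is not stated in the abstract.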