Continuous Uniqueness and Novelty Metrics for Generative Modeling of Inorganic Crystals

To address pressing scientific challenges such as climate change, increasingly sophisticated generative artificial intelligence models are being developed to efficiently sample the large chemical space of possible functional materials. These models can quickly propose new chemical compositions paired with crystal structures. They are typically evaluated using uniqueness and novelty metrics, which depend on a chosen crystal distance function. However, the most prevalent distance function has four limitations: it fails to quantify the degree of similarity between compounds, cannot distinguish compositional differences from structural differences, is not Lipschitz continuous with respect to shifts in atomic coordinates, and yields a uniqueness metric that is not invariant under permutation of the generated samples. In this work, we propose using two continuous distance functions to evaluate uniqueness and novelty, which theoretically overcome these limitations. Our experiments show that these distances reveal insights missed by traditional distance functions, providing a more reliable basis for evaluating and comparing generative models for inorganic crystals.
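The permutation issue can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and not the paper's definitions: crystals are replaced by plain floats, `toy_distance` stands in for a crystal distance, the binary metric is modeled as greedy deduplication under a tolerance, and the continuous metrics are taken to be a mean pairwise distance (uniqueness) and a mean nearest-reference distance (novelty).

```python
from itertools import combinations


def toy_distance(a, b):
    # Hypothetical stand-in for a crystal distance; "crystals" are floats here.
    return abs(a - b)


def binary_uniqueness(samples, tol=1.0):
    # Assumed model of a match-based metric: greedily keep a sample only if
    # it is farther than `tol` from every sample kept so far.
    kept = []
    for s in samples:
        if all(toy_distance(s, k) > tol for k in kept):
            kept.append(s)
    return len(kept) / len(samples)


def continuous_uniqueness(samples):
    # Assumed continuous variant: mean distance over all unordered pairs.
    pairs = list(combinations(samples, 2))
    return sum(toy_distance(a, b) for a, b in pairs) / len(pairs)


def continuous_novelty(samples, reference):
    # Assumed continuous variant: mean distance from each sample to its
    # nearest neighbour in the reference (e.g. training) set.
    nearest = [min(toy_distance(s, r) for r in reference) for s in samples]
    return sum(nearest) / len(samples)


order_a = [0.0, 1.0, 2.0]
order_b = [1.0, 0.0, 2.0]  # same samples, different order

# Matching under a tolerance is not transitive, so the greedy count
# depends on which sample is visited first.
print(binary_uniqueness(order_a), binary_uniqueness(order_b))

# The pairwise mean ignores ordering.
print(continuous_uniqueness(order_a), continuous_uniqueness(order_b))
print(continuous_novelty(order_a, reference=[0.0]))
```

In this toy setting the greedy count keeps two of the three samples in the first ordering and only one in the second, while the pairwise mean is identical for both orderings.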

