The emergence of Large Language Models (LLMs) has inspired the vision of generating bespoke crystalline materials directly from natural-language instructions, enabling users to design materials through intuitive, conversational interaction. Existing text-to-crystal generative models are important early steps toward this goal, but they suffer from two critical limitations: (i) restricted input formats that require highly structured descriptions (e.g., chemical formulas), and (ii) one-directional generation, where models can map text to a crystal but cannot perform the inverse mapping from a crystal back to text. These limitations prevent fully conversational workflows and hinder alignment with users' inherently ambiguous and evolving desiderata. We address these challenges with LapidaryEngine, the first model to support fully conversational crystal generation. LapidaryEngine accepts free-form natural-language requests and performs iterative refinement and editing in a dialogue-like manner. Its key innovation is a pivot representation: a third, intermediate form that enables bidirectional translation between text and crystal structures despite the absence of direct paired datasets. Leveraging this pivot allows robust interpretation of user feedback and precise structural control. We demonstrate LapidaryEngine across diverse tasks, including insulator discovery, stability optimization, compositional modification, and structural editing, showcasing its ability to align generated materials with user intent interactively.