Zero-shot mutation prediction is vital for low-resource protein engineering, yet existing protein language models (PLMs) often produce statistically confident predictions that ignore fundamental biophysical constraints. Currently, selecting candidates for wet-lab validation relies on manual expert auditing of PLM outputs, a process that is inefficient, subjective, and highly dependent on domain expertise. To address this, we propose Rank-and-Reason (VenusRAR), a two-stage agentic framework that automates this workflow and maximizes expected wet-lab fitness. In the Rank-Stage, a Computational Expert and a Virtual Biologist aggregate a context-aware multi-modal ensemble, establishing a new record Spearman correlation of 0.551 (vs. 0.518) on ProteinGym. In the Reason-Stage, an agentic Expert Panel employs chain-of-thought reasoning to audit candidates against geometric and structural constraints, improving the Top-5 Hit Rate by up to 367% on ProteinGym-DMS99. Wet-lab validation on the Cas12i3 nuclease further confirms the framework's efficacy, achieving a 46.7% positive rate and identifying two novel mutants with 4.23-fold and 5.05-fold activity improvements. Code and datasets are released on GitHub (https://github.com/ai4protein/VenusRAR/).
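The two headline metrics, Spearman correlation and Top-5 Hit Rate, can be illustrated with a minimal Python sketch. This is not the VenusRAR evaluation code: the function names are hypothetical, and the definition of a "hit" (measured fitness above a threshold, taken here to be 0.0 for the wild type) is an assumption that may differ from the benchmark's own definition.

```python
from typing import Sequence


def average_ranks(values: Sequence[float]) -> list[float]:
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j share one rank
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(pred: Sequence[float], true: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation of the rank vectors.

    Assumes equal-length inputs and that neither input is constant.
    """
    rp, rt = average_ranks(pred), average_ranks(true)
    n = len(rp)
    mp, mt = sum(rp) / n, sum(rt) / n
    cov = sum((a - mp) * (b - mt) for a, b in zip(rp, rt))
    var_p = sum((a - mp) ** 2 for a in rp)
    var_t = sum((b - mt) ** 2 for b in rt)
    return cov / (var_p * var_t) ** 0.5


def top_k_hit_rate(pred: Sequence[float], true: Sequence[float],
                   k: int = 5, threshold: float = 0.0) -> float:
    """Fraction of the k highest-scored mutants whose measured fitness
    exceeds `threshold` (assumed definition of a hit)."""
    top = sorted(range(len(pred)), key=lambda i: pred[i], reverse=True)[:k]
    return sum(true[i] > threshold for i in top) / len(top)


def relative_improvement(baseline: float, improved: float) -> float:
    """Relative gain over a baseline, in percent."""
    return (improved - baseline) / baseline * 100
```

Under these assumptions, `relative_improvement(0.518, 0.551)` evaluates to roughly 6.4, the relative gain in Spearman correlation implied by the reported numbers. The 367% figure is a relative gain of the same kind, so the corresponding absolute hit rates depend on the baseline being compared against.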