We present protein autoregressive modeling (PAR), the first multi-scale autoregressive framework for protein backbone generation via coarse-to-fine next-scale prediction. Exploiting the hierarchical nature of proteins, PAR generates structures in a manner that resembles sculpting a statue: it first forms a coarse topology and then refines structural details over successive scales. To achieve this, PAR consists of three key components: (i) multi-scale downsampling operations that represent protein structures across multiple scales during training; (ii) an autoregressive transformer that encodes multi-scale information and produces conditional embeddings to guide structure generation; and (iii) a flow-based backbone decoder that generates backbone atoms conditioned on these embeddings. Autoregressive models also suffer from exposure bias, which is caused by the mismatch between the training and generation procedures and substantially degrades structure generation quality. We alleviate this issue by adopting noisy context learning and scheduled sampling, enabling robust backbone generation. Notably, PAR exhibits strong zero-shot generalization, supporting flexible human-prompted conditional generation and motif scaffolding without fine-tuning. On the unconditional generation benchmark, PAR effectively learns protein distributions, produces backbones of high design quality, and exhibits favorable scaling behavior. Together, these properties establish PAR as a promising framework for protein structure generation.
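To make component (i) concrete, the following is a minimal sketch of how a backbone could be represented at several scales. The abstract does not specify the downsampling operator, so everything here is an assumption made for illustration: the function names `downsample_backbone` and `multi_scale_pyramid` are hypothetical, each residue is reduced to a single 3D coordinate (for example, its C-alpha atom), and downsampling is done by averaging contiguous segments of residues. PAR's actual operator may differ.

```python
import numpy as np


def downsample_backbone(coords, target_len):
    """Average contiguous residue segments down to `target_len` points.

    Assumption: segment averaging stands in for the paper's unspecified
    downsampling operation.
    """
    coords = np.asarray(coords, dtype=float)
    if not 1 <= target_len <= len(coords):
        raise ValueError("target_len must be between 1 and the chain length")
    # Split the chain into `target_len` nearly equal contiguous segments.
    chunks = np.array_split(coords, target_len, axis=0)
    # Each coarse point is the centroid of one segment.
    return np.stack([chunk.mean(axis=0) for chunk in chunks])


def multi_scale_pyramid(coords, scale_lens):
    """Return one array per scale, ordered as given (coarse to fine)."""
    return [downsample_backbone(coords, n) for n in scale_lens]


# Toy chain of 8 residues with one 3D coordinate per residue.
ca = np.arange(24, dtype=float).reshape(8, 3)
pyramid = multi_scale_pyramid(ca, [2, 4, 8])
print([p.shape for p in pyramid])
```

Under these assumptions, the coarsest scale captures only the overall topology, while the finest scale reproduces the original coordinates, which is the ordering a next-scale autoregressive model would predict in.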