Discrete biological sequence optimization requires iterative refinement under strict syntactic constraints. Diffusion models offer progressive refinement but do not naturally expose controllable discrete edit operations, while autoregressive LLMs often lack explicit long-horizon planning for constrained edits. We propose STRIDE (Sequence Trajectory Refinement via Internalized Denoising Emulation), a post-training framework that trains an LLM to emit executable trajectories of atomic edits (INSERT/DELETE/REPLACE) as a verifiable reasoning trace for variable-length refinement. STRIDE combines supervised fine-tuning on Levenshtein-aligned shortest edit demonstrations with group-based policy optimization, aligning edit trajectories with task rewards while preserving coherent editing behavior. On protein fluorescence and instruction-conditioned molecular optimization, STRIDE improves variable-length protein editing success from 42% to 89%, increases novelty from 47% to 97%, and yields stronger validity and controllability than diverse baselines. Code is available at https://github.com/daiheng-zhang/STRIDE.
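The supervised stage described above fine-tunes on shortest edit scripts derived from a Levenshtein alignment. A minimal sketch of how such demonstrations could be constructed is shown below: compute the alignment by dynamic programming, backtrack to an INSERT/DELETE/REPLACE trajectory, then replay it to check the trajectory is executable. The function names are illustrative and not taken from the STRIDE codebase.

```python
def shortest_edit_script(src: str, tgt: str):
    """Levenshtein DP alignment; backtrack to a shortest atomic edit script."""
    m, n = len(src), len(tgt)
    # dp[i][j] = edit distance between src[:i] and tgt[:j]
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i
    for j in range(n + 1):
        dp[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if src[i - 1] == tgt[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # DELETE src[i-1]
                           dp[i][j - 1] + 1,         # INSERT tgt[j-1]
                           dp[i - 1][j - 1] + cost)  # REPLACE or match
    # Backtrack from (m, n); positions index the ORIGINAL source sequence.
    ops, i, j = [], m, n
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and dp[i][j] == dp[i - 1][j - 1]
                and src[i - 1] == tgt[j - 1]):
            i, j = i - 1, j - 1                       # match: no edit
        elif i > 0 and j > 0 and dp[i][j] == dp[i - 1][j - 1] + 1:
            ops.append(("REPLACE", i - 1, tgt[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and dp[i][j] == dp[i - 1][j] + 1:
            ops.append(("DELETE", i - 1, src[i - 1]))
            i -= 1
        else:
            ops.append(("INSERT", i, tgt[j - 1]))     # insert before src[i]
            j -= 1
    return ops[::-1]  # forward order, non-decreasing positions

def apply_edits(src: str, ops):
    """Replay a script; applying right-to-left keeps source positions valid."""
    s = list(src)
    for op, pos, ch in reversed(ops):
        if op == "REPLACE":
            s[pos] = ch
        elif op == "DELETE":
            del s[pos]
        else:  # INSERT
            s.insert(pos, ch)
    return "".join(s)
```

Applying the edits in reverse position order means earlier insertions and deletions never invalidate the source-indexed positions of later ones, which is what makes the emitted trajectory directly executable as a verification step.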