Language Models (LMs) are increasingly used in applications where generated outputs must satisfy strict semantic or syntactic constraints. Existing approaches to constrained generation fall along a spectrum: greedy constrained decoding (GCD) enforces validity during decoding but distorts the LM's distribution, while rejection sampling (RS) preserves fidelity but wastes computation by discarding invalid outputs. Both extremes are problematic in domains such as program fuzzing, where validity and diversity of samples are both essential. We present Constrained Adaptive Rejection Sampling (CARS), an approach that strictly improves the sample efficiency of RS without distributional distortion. CARS begins with unconstrained LM sampling and adaptively rules out constraint-violating continuations by recording them in a trie and subtracting their probability mass from future draws. This adaptive pruning ensures that prefixes proven invalid are never revisited, that acceptance rates improve monotonically, and that the resulting samples exactly follow the constrained distribution. In experiments across a variety of domains, including program fuzzing and molecular generation, CARS consistently achieves higher efficiency (measured in LM forward passes per valid sample) than both GCD and methods that approximate the LM's distribution, while also producing stronger sample diversity.
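The pruning idea described above can be illustrated with a minimal toy sketch. Everything here is an assumption for illustration: a fixed-vocabulary unigram "LM" (`lm_prob`), a toy validity predicate (`valid`), and the names `Node` and `sample` are hypothetical, not the paper's implementation. The sketch draws full sequences from the unconstrained model, and on rejection marks the sampled leaf dead in a trie and propagates the lost probability mass upward, so rejected continuations are never drawn again.

```python
import random

VOCAB = "ab"   # toy two-token vocabulary (assumption)
LENGTH = 4     # fixed sequence length (assumption)

def lm_prob(prefix, tok):
    # Toy unigram "LM": P(a)=0.7, P(b)=0.3, independent of the prefix.
    return 0.7 if tok == "a" else 0.3

def valid(s):
    # Toy constraint: a sequence is valid iff it contains at least one 'b'.
    # Only "aaaa" violates it, so at most one rejection can ever occur.
    return "b" in s

class Node:
    """Trie node tracking the fraction of its subtree's mass proven invalid."""
    def __init__(self):
        self.children = {}
        self.dead = 0.0

def sample(root):
    """Draw one sequence with dead subtrees masked out; return (string, accepted)."""
    node, prefix, path = root, "", [root]
    for _ in range(LENGTH):
        # Effective weight of each token = LM prob times its surviving fraction.
        weights = {}
        for t in VOCAB:
            child = node.children.get(t)
            alive = 1.0 - (child.dead if child else 0.0)
            weights[t] = lm_prob(prefix, t) * alive
        total = sum(weights.values())
        if total == 0.0:
            break  # every continuation of this prefix is already ruled out
        r = random.random() * total
        for t, w in weights.items():
            r -= w
            if r <= 0:
                break
        node = node.children.setdefault(t, Node())
        prefix += t
        path.append(node)
    if valid(prefix):
        return prefix, True
    # Reject: mark the leaf dead and recompute ancestors' dead fractions,
    # subtracting the invalid mass from all future draws.
    path[-1].dead = 1.0
    for parent in reversed(path[:-1]):
        parent.dead = sum(lm_prob("", t) * c.dead
                          for t, c in parent.children.items())
    return prefix, False
```

In this toy setting the surviving mass at each node is renormalized implicitly by sampling proportionally to the adjusted weights, so accepted samples follow the LM's distribution conditioned on validity, and each invalid prefix costs at most one rejection.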