NeurAlly-Decomposed Oracle (NADO) is a powerful approach for controllable generation with large language models. Unlike fine-tuning or prompt tuning, it has the potential to avoid catastrophic forgetting in the large base model and to achieve guaranteed convergence to an entropy-maximized closed-form solution without significantly limiting model capacity. Despite its success, several challenges arise when applying NADO to more complex scenarios. First, the best practice for using NADO to compose multiple control signals is under-explored. Second, vanilla NADO suffers from vanishing gradients for low-probability control signals and relies heavily on forward-consistency regularization. In this paper, we study these challenges both theoretically and empirically. We show that guaranteed compositional generalization of NADO can be achieved when a specific practice is followed, and we propose a novel alternative parameterization of NADO that guarantees forward consistency exactly. We evaluate the improved training of NADO, i.e., NADO++, on CommonGen. Results show that NADO++ improves the effectiveness of the algorithm in multiple aspects.