Improving the controllability, portability, and inference speed of diffusion language models (DLMs) is a key challenge in natural language generation. Although recent language models have shown significant success in complex text generation, their memory and computational requirements remain very demanding, falling short of expectations, which limits portability and makes the models unstable. Numerous well-established neural network quantization methods have been proposed to mitigate these issues. To further improve portability for standalone deployment and stability as measured by perplexity, we propose a novel approach called the Quantized Embedding Controllable Diffusion Language Model (QE-CDLM). QE-CDLM builds on recent successful controllable DLMs by remodeling the task-specific embedding space via quantization. This yields a gradient-based controller for the generation tasks and more stable intermediate latent variables, which in turn lead to faster convergence and better controllability. Additionally, an adaptation fine-tuning method is employed to reduce the number of tunable weights. Experimental results on five challenging fine-grained control tasks demonstrate that QE-CDLM compares favorably to existing methods in terms of quality and feasibility, achieving better perplexity and lightweight fine-tuning.