Deep generative neural networks, such as Variational AutoEncoders (VAEs), offer an opportunity to better understand and control language models from the perspective of sentence-level latent spaces. To combine the controllability of VAE latent spaces with the state-of-the-art performance of recent large language models (LLMs), we present LlaMaVAE, which integrates expressive encoder and decoder models (sentenceT5 and LlaMA) into a VAE architecture, aiming to give LLMs better control over text generation. In addition, to conditionally guide VAE generation, we investigate a new approach based on flow-based invertible neural networks (INNs), named Invertible CVAE. Experimental results reveal that LlaMaVAE can outperform the previous state-of-the-art VAE language model, Optimus, across various tasks, including language modelling, semantic textual similarity, and definition modelling. Qualitative analysis of interpolation and traversal experiments also indicates an increased degree of semantic clustering and geometric consistency, which enables better generation control.
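To make the VAE bottleneck between encoder and decoder concrete, the following is a minimal sketch of a generic Gaussian latent layer, not the implementation used in this work. It assumes a sentence embedding is linearly projected to a mean and log-variance, a latent vector is drawn with the reparameterization trick, and the KL divergence to a standard normal prior is used as the regularizer. The dimensions, random weights, and helper names (`linear`, `init_layer`, `reparameterize`, `kl_to_standard_normal`) are hypothetical, and the sketch stops at the latent vector because the way it conditions the decoder is not described here.

```python
import math
import random


def linear(x, weights, bias):
    """Dense layer; `weights` holds one row of input weights per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def init_layer(rng, n_out, n_in, scale=0.1):
    """Random toy parameters (assumption: stands in for trained weights)."""
    weights = [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def reparameterize(mu, logvar, rng):
    """Sample z = mu + sigma * eps, with eps ~ N(0, I) and sigma = exp(logvar / 2)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, logvar)]


def kl_to_standard_normal(mu, logvar):
    """KL(N(mu, diag(exp(logvar))) || N(0, I)), summed over latent dimensions."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv
                     for m, lv in zip(mu, logvar))


rng = random.Random(0)
EMBED_DIM, LATENT_DIM = 8, 4  # toy sizes chosen only for illustration

# Stand-in for a sentence embedding produced by the encoder.
sentence_embedding = [rng.gauss(0.0, 1.0) for _ in range(EMBED_DIM)]

w_mu, b_mu = init_layer(rng, LATENT_DIM, EMBED_DIM)
w_lv, b_lv = init_layer(rng, LATENT_DIM, EMBED_DIM)

mu = linear(sentence_embedding, w_mu, b_mu)
logvar = linear(sentence_embedding, w_lv, b_lv)
z = reparameterize(mu, logvar, rng)     # latent sentence vector
kl = kl_to_standard_normal(mu, logvar)  # regularization term of the VAE objective
```

In a full model, `z` would be passed to the decoder and the KL term would be added to the reconstruction loss. Interpolation and traversal experiments then operate directly on vectors such as `z`.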