Achieving precise semantic control over the latent spaces of Variational AutoEncoders (VAEs) holds significant value for downstream tasks in NLP, as the underlying generative mechanisms could be better localised, explained, and improved upon. Recent research, however, has struggled to achieve consistent results, primarily due to the inevitable loss of semantic information in the variational bottleneck and limited control over the decoding mechanism. To overcome these challenges, we investigate discrete latent spaces in Vector Quantized Variational AutoEncoders (VQVAEs) to improve semantic control and generation in Transformer-based VAEs. In particular, we propose T5VQVAE, a novel model that leverages the controllability of VQVAEs to guide the self-attention mechanism in T5 at the token level, exploiting its full generalization capabilities. Experimental results indicate that T5VQVAE outperforms existing state-of-the-art VAE models, including Optimus, in terms of controllability and preservation of semantic information across different tasks such as auto-encoding of sentences and mathematical expressions, text transfer, and inference. Moreover, T5VQVAE exhibits improved inference capabilities, suggesting potential applications for downstream natural language and symbolic reasoning tasks.
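To make the notion of a discrete, token-level latent space concrete, the following is a minimal sketch of the generic vector quantization step used in VQVAEs: each token's continuous encoder state is replaced by its nearest entry in a codebook. It is not the T5VQVAE implementation; the function names (`squared_distance`, `quantize`), the codebook values, and the two-dimensional states are all hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: nearest-neighbour vector quantization of per-token
# encoder states. Names, sizes, and values are hypothetical, not from T5VQVAE.


def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def quantize(token_states, codebook):
    """Map each token state to the index and vector of its nearest codebook entry."""
    indices = []
    quantized = []
    for state in token_states:
        # Pick the codebook entry closest to this token's continuous state.
        best = min(
            range(len(codebook)),
            key=lambda k: squared_distance(state, codebook[k]),
        )
        indices.append(best)
        quantized.append(list(codebook[best]))
    return indices, quantized


# Toy codebook with three 2-dimensional entries (assumed values).
codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 2.0]]
# Continuous states for three tokens, standing in for encoder outputs.
token_states = [[0.1, -0.2], [0.9, 1.2], [-0.8, 1.7]]

indices, quantized = quantize(token_states, codebook)
print(indices)
```

In a VQVAE, the decoder consumes the quantized vectors rather than the continuous states, so each token is represented by one discrete code index. How T5VQVAE feeds these codes into the T5 attention mechanism is not specified in the text above and is not modelled here.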