Disentangled latent spaces usually have better semantic separability and geometrical properties, which leads to better interpretability and more controllable data generation. While this has been well investigated in Computer Vision, for example in image disentanglement, sentence disentanglement in NLP remains comparatively under-investigated. Most previous work has concentrated on disentangling task-specific generative factors, such as sentiment, in the context of style transfer. In this work, we focus on a more general form of sentence disentanglement, targeting the localised modification and control of more general sentence semantic features. To achieve this, we contribute a novel notion of sentence semantic disentanglement and introduce a flow-based invertible neural network (INN) mechanism integrated with a transformer-based language autoencoder (AE) to deliver latent spaces with better separability properties. Experimental results demonstrate that the model can transform the distributed latent space into a better semantically disentangled sentence space, leading to improved language interpretability and controlled generation compared to recent state-of-the-art language VAE models.
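To make the idea of a flow-based invertible mapping over an autoencoder latent space more concrete, the following is a minimal sketch of a single affine coupling layer, a common building block of flow-based INNs. It is not the architecture used in this work: the class name `AffineCoupling`, the toy conditioner weights, the hidden size, and the latent dimension are all assumptions made for illustration, and the latent vectors are random stand-ins for the output of a language autoencoder.

```python
import numpy as np


class AffineCoupling:
    """Illustrative affine coupling layer (hypothetical, not the paper's model)."""

    def __init__(self, dim, hidden=16, seed=0):
        rng = np.random.default_rng(seed)
        self.d = dim // 2          # size of the half that passes through unchanged
        rest = dim - self.d        # size of the half that gets transformed
        # Small random weights standing in for a trained conditioner network.
        self.w1 = rng.normal(scale=0.1, size=(self.d, hidden))
        self.w_s = rng.normal(scale=0.1, size=(hidden, rest))
        self.w_t = rng.normal(scale=0.1, size=(hidden, rest))

    def _scale_shift(self, z_a):
        # Log-scale and shift depend only on the unchanged half,
        # which is what makes the layer exactly invertible.
        h = np.tanh(z_a @ self.w1)
        log_s = np.tanh(h @ self.w_s)  # bounded log-scale for numerical stability
        t = h @ self.w_t
        return log_s, t

    def forward(self, z):
        z_a, z_b = z[:, :self.d], z[:, self.d:]
        log_s, t = self._scale_shift(z_a)
        y_b = z_b * np.exp(log_s) + t
        return np.concatenate([z_a, y_b], axis=1)

    def inverse(self, y):
        y_a, y_b = y[:, :self.d], y[:, self.d:]
        log_s, t = self._scale_shift(y_a)
        z_b = (y_b - t) * np.exp(-log_s)
        return np.concatenate([y_a, z_b], axis=1)


# Random vectors standing in for autoencoder sentence embeddings (assumption).
z = np.random.default_rng(1).normal(size=(4, 8))
layer = AffineCoupling(dim=8)
y = layer.forward(z)
z_back = layer.inverse(y)
print(np.allclose(z, z_back))
```

Because the inverse is available in closed form, a latent vector can be mapped into the transformed space, modified there, and mapped back to the autoencoder space for decoding. A practical flow would stack several such layers and alternate which half is transformed.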