This paper presents structured probabilistic coding (SPC), a new supervised representation learning framework that learns compact, informative representations of the input that are relevant to the target task. SPC is an encoder-only probabilistic coding technique with a structured regularization derived from the target label space, and it enhances the generalization ability of pre-trained language models for better language understanding. Specifically, the probabilistic coding performs information encoding and task prediction simultaneously in a single module, so that the useful information in the input data is exploited more fully. It applies variational inference in the output space to reduce randomness and uncertainty. In addition, to better control the probability distribution in the latent space, a structured regularization is proposed to promote class-level uniformity. With this regularization term, SPC preserves the Gaussian structure of the latent code while covering the latent space more uniformly across classes. Experimental results on 12 natural language understanding tasks demonstrate that SPC effectively improves the performance of pre-trained language models on classification and regression. Further experiments show that SPC enhances generalization, robustness to label noise, and the clustering quality of the output representations.
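To make the three ingredients named in the abstract concrete (a task prediction term computed directly on the sampled latent code, a variational term in the output space, and a class-level uniformity regularizer), the following is a minimal NumPy sketch. It is an illustration under stated assumptions, not the authors' implementation: the function name `spc_style_loss`, the weights `beta` and `gamma`, the standard normal prior, and the choice of negative batch entropy as the uniformity regularizer are all hypothetical, since the abstract does not specify them.

```python
import numpy as np


def softmax(z):
    """Numerically stable softmax over the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def spc_style_loss(mu, log_var, labels, beta=1e-3, gamma=1e-1, rng=None):
    """Illustrative SPC-style objective (assumed form, not the paper's exact loss).

    mu, log_var: arrays of shape (batch, num_classes), e.g. produced by an
                 encoder head; the latent code lives in the output space.
    labels:      integer class labels of shape (batch,).
    beta, gamma: hypothetical weights for the KL and regularization terms.
    """
    rng = np.random.default_rng(0) if rng is None else rng

    # Reparameterized sample of the latent code t ~ N(mu, diag(exp(log_var))).
    eps = rng.standard_normal(mu.shape)
    t = mu + np.exp(0.5 * log_var) * eps

    # Task term: the sampled code is used directly as class logits,
    # so encoding and prediction happen in one module.
    probs = softmax(t)
    n = mu.shape[0]
    task = -np.log(probs[np.arange(n), labels] + 1e-12).mean()

    # Variational term: KL(N(mu, sigma^2) || N(0, I)), averaged over the batch.
    # The standard normal prior is an assumption made for this sketch.
    kl = 0.5 * (np.exp(log_var) + mu**2 - 1.0 - log_var).sum(axis=-1).mean()

    # Uniformity term (assumed form): negative entropy of the batch-averaged
    # class distribution; minimizing it spreads predictions across classes.
    batch_probs = probs.mean(axis=0)
    entropy = -(batch_probs * np.log(batch_probs + 1e-12)).sum()
    reg = -entropy

    total = task + beta * kl + gamma * reg
    return total, {"task": task, "kl": kl, "reg": reg}


if __name__ == "__main__":
    # Toy inputs standing in for encoder outputs on a 3-class problem.
    mu = np.zeros((4, 3))
    log_var = np.zeros((4, 3))
    labels = np.array([0, 1, 2, 0])
    total, parts = spc_style_loss(mu, log_var, labels)
    print(total, parts)
```

In this sketch the KL term vanishes when the encoder outputs match the prior (zero mean, unit variance), and the regularizer is bounded below by the negative log of the number of classes, which is reached when the batch-averaged prediction is uniform.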