The success of pre-trained contextualized representations has prompted researchers to analyze them for the presence of linguistic information. It is natural to assume that these representations encode some degree of linguistic knowledge: they have brought about large empirical improvements on a wide variety of NLP tasks, which suggests they are learning true linguistic generalizations. In this work, we focus on intrinsic probing, an analysis technique whose goal is not only to identify whether a representation encodes a linguistic attribute, but also to pinpoint where that attribute is encoded. We propose a novel latent-variable formulation for constructing intrinsic probes and derive a tractable variational approximation to the log-likelihood. Our results show that our model is versatile and yields tighter mutual information estimates than two intrinsic probes previously proposed in the literature. Finally, we find empirical evidence that pre-trained representations develop a cross-lingually entangled notion of morphosyntax.
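The abstract refers to a tractable variational approximation to the log-likelihood without stating it. Purely as an illustrative sketch, and not the paper's actual derivation, consider a latent-variable probe in which a latent variable C selects a subset of representation dimensions; the notation below (attribute \pi, representation \mathbf{h}, restriction \mathbf{h}_C, variational distribution q) is our assumption. A standard evidence lower bound then follows from Jensen's inequality:

% Illustrative sketch only; symbols are hypothetical, not the paper's notation.
% \pi: linguistic attribute (e.g., tense); \mathbf{h}: a contextual representation;
% \mathbf{h}_C: the restriction of \mathbf{h} to the dimensions in C;
% q(C): a variational distribution over subsets; \mathrm{H}(q): its entropy.
\begin{align}
\log p(\pi \mid \mathbf{h})
  &= \log \sum_{C} q(C)\, \frac{p(\pi \mid \mathbf{h}_C)\, p(C)}{q(C)} \\
  &\geq \mathbb{E}_{q(C)}\!\left[ \log p(\pi \mid \mathbf{h}_C) + \log p(C) \right] + \mathrm{H}(q).
\end{align}

Under this sketch, maximizing the lower bound jointly over the probe parameters and q(C) gives a tractable surrogate objective for the otherwise intractable marginal log-likelihood over subsets of dimensions.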