Understanding the latent space of language models (LMs) is crucial to improving their performance and interpretability. Existing analyses often fail to provide disentangled, model-centric insights into LM semantics, and they neglect essential aspects of LM adaptation. In response, we introduce vocabulary-defined semantics, a method that establishes a reference frame within the LM latent space so that semantic analysis is disentangled and grounded in the LM vocabulary, in contrast to prior entangled analyses. Furthermore, we propose a novel technique for computing logits that emphasises differentiability and local isotropy, and we introduce a neural clustering module that semantically calibrates data representations during LM adaptation. In extensive experiments across diverse text understanding datasets, our approach outperforms state-of-the-art retrieval-augmented generation and parameter-efficient finetuning methods, demonstrating its efficacy and broad applicability. Our findings shed light on LM mechanics and offer practical ways to enhance LM performance and interpretability.