Understanding the latent space of language models (LMs) is crucial to improving their performance and interpretability. Existing analyses often fail to provide disentangled (model-centric) insights into LM semantics, and they neglect essential aspects of LM adaptation. In response, we introduce vocabulary-defined semantics, a method that establishes a reference frame within the LM latent space so that semantic analysis is disentangled and grounded in the LM vocabulary. Unlike prior entangled analyses, our approach uses the LM vocabulary to obtain model-centric insights. Furthermore, we propose a technique for computing logits that emphasises differentiability and local isotropy, and we introduce a neural clustering module that semantically calibrates data representations during LM adaptation. In extensive experiments across diverse text understanding datasets, our approach outperforms state-of-the-art methods in retrieval-augmented generation and parameter-efficient finetuning, demonstrating its efficacy and broad applicability. Our findings not only shed light on LM mechanics but also offer practical ways to enhance LM performance and interpretability.