As the foundation of current natural language processing methods, pre-trained language model has achieved excellent performance. However, the black-box structure of the deep neural network in pre-trained language models seriously limits the interpretability of the language modeling process. After revisiting the coupled requirement of deep neural representation and semantics logic of language modeling, a Word-Context-Coupled Space (W2CSpace) is proposed by introducing the alignment processing between uninterpretable neural representation and interpretable statistical logic. Moreover, a clustering process is also designed to connect the word- and context-level semantics. Specifically, an associative knowledge network (AKN), considered interpretable statistical logic, is introduced in the alignment process for word-level semantics. Furthermore, the context-relative distance is employed as the semantic feature for the downstream classifier, which is greatly different from the current uninterpretable semantic representations of pre-trained models. Our experiments for performance evaluation and interpretable analysis are executed on several types of datasets, including SIGHAN, Weibo, and ChnSenti. Wherein a novel evaluation strategy for the interpretability of machine learning models is first proposed. According to the experimental results, our language model can achieve better performance and highly credible interpretable ability compared to related state-of-the-art methods.
翻译:当前自然语言处理方法的基础——预训练语言模型已取得卓越性能。然而,预训练语言模型中深度神经网络的黑箱结构严重制约了语言建模过程的可解释性。通过重新审视深度神经表征与语言建模语义逻辑的耦合需求,本文提出词语-上下文耦合空间(W2CSpace),通过引入不可解释神经表征与可解释统计逻辑的对齐处理实现该目标。此外,设计了连接词汇级与上下文级语义的聚类过程。具体而言,在词汇级语义的对齐过程中引入可解释统计逻辑——联想知识网络(AKN)。进一步地,采用上下文相关距离作为下游分类器的语义特征,这与当前预训练模型不可解释的语义表征形成显著差异。我们在SIGHAN、Weibo和ChnSenti等多类数据集上开展了性能评估与可解释性分析实验,并首次提出针对机器学习模型可解释性的新型评估策略。实验结果表明,相较现有最优方法,本语言模型能获得更优性能与高度可信的可解释能力。