Probing the Category of Verbal Aspect in Transformer Language Models

We investigate how pretrained language models (PLMs) encode the grammatical category of verbal aspect in Russian. The encoding of aspect in transformer LMs has not previously been studied in any language. A particular challenge is posed by "alternative contexts", in which either the perfective or the imperfective aspect is grammatically and semantically suitable. We probe BERT and RoBERTa on alternative and non-alternative contexts. First, we assess the models' performance on aspect prediction via behavioral probing. Next, we examine the models' performance when their contextual representations are substituted with counterfactual representations, via causal probing. These counterfactuals alter the value of "boundedness", a semantic feature that characterizes the action in the context. Experiments show that BERT and RoBERTa do encode aspect, mostly in their final layers. The counterfactual interventions affect perfective and imperfective in opposite ways, which is consistent with grammar: perfective is positively affected by adding the meaning of boundedness, and vice versa. A practical implication of our probing results is that fine-tuning only the last layers of BERT on predicting aspect is faster and more effective than fine-tuning the whole model. The model has high predictive uncertainty about aspect in alternative contexts, which tend to lack explicit hints about the boundedness of the described action.
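The notion of predictive uncertainty about aspect can be illustrated with a small sketch. The abstract does not say how uncertainty is measured, so the choice of binary entropy here is an assumption, as are the function names and the example scores, which are made up for illustration and are not results from the paper. The sketch takes two scores that a model might assign to the perfective and imperfective forms of a verb in the same context, renormalizes them, and reports the entropy of the resulting two-way choice.

```python
import math


def aspect_distribution(score_pf: float, score_ipf: float) -> tuple[float, float]:
    """Renormalize two candidate scores so that they sum to 1.

    The scores stand for whatever the model assigns to the perfective and
    imperfective verb forms (hypothetical inputs, e.g. masked-LM probabilities).
    """
    if score_pf < 0 or score_ipf < 0:
        raise ValueError("scores must be non-negative")
    total = score_pf + score_ipf
    if total <= 0:
        raise ValueError("at least one score must be positive")
    return score_pf / total, score_ipf / total


def binary_entropy(p: float) -> float:
    """Entropy in bits of a two-way choice: 0 is certain, 1 is maximally uncertain."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def aspect_uncertainty(score_pf: float, score_ipf: float) -> float:
    """Uncertainty (in bits) about which aspect fits the context."""
    p_pf, _ = aspect_distribution(score_pf, score_ipf)
    return binary_entropy(p_pf)


if __name__ == "__main__":
    # Made-up scores, for illustration only (assumption, not data from the paper).
    contexts = {
        "context with an explicit boundedness cue": (0.62, 0.02),
        "alternative context without such a cue": (0.21, 0.19),
    }
    for name, (pf, ipf) in contexts.items():
        print(f"{name}: {aspect_uncertainty(pf, ipf):.3f} bits")
```

Under this measure, a context where both aspects receive similar scores yields an entropy close to 1 bit, while a context that strongly favors one aspect yields a value close to 0.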

