Assessing Linguistic Generalisation in Language Models: A Dataset for Brazilian Portuguese

Much recent effort has been devoted to creating large-scale language models. The most prominent current approaches are based on deep neural networks, such as BERT. However, these models lack transparency and interpretability and are often seen as black boxes. This affects not only their applicability in downstream tasks but also the comparability of different architectures, or even of the same model trained with different corpora or hyperparameters. In this paper, we propose a set of intrinsic evaluation tasks that inspect the linguistic information encoded in models developed for Brazilian Portuguese. These tasks are designed to evaluate how different language models generalise information related to grammatical structures and multiword expressions (MWEs), thus allowing an assessment of whether a model has learned different linguistic phenomena. The dataset developed for these tasks consists of sentences, each with a single masked word and a cue phrase that helps narrow down the context. The dataset is divided into MWEs and grammatical structures, and the latter is subdivided into six tasks: impersonal verbs, subject agreement, verb agreement, nominal agreement, passive and connectors. The MWE subset was used to test BERTimbau Large, BERTimbau Base and mBERT. For the grammatical structures, we used only BERTimbau Large, because it yielded the best results in the MWE task.
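The masked-word setup described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `MaskedItem` field names, the hit-at-k scoring rule, the two example sentences, and the `toy_predict` stand-in are not taken from the dataset or the paper, and the abstract does not state which metric the authors used. In a real run, the predictor would wrap a masked language model such as BERTimbau instead of a lookup table.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

MASK = "[MASK]"  # BERT-style mask token


@dataclass
class MaskedItem:
    """One test sentence. Field names are hypothetical, not the dataset's schema."""
    cue: str                 # cue phrase that narrows down the context
    sentence: str            # sentence containing exactly one MASK
    accepted: Sequence[str]  # fillers counted as correct
    task: str                # e.g. "MWE" or "nominal agreement"

    def text(self) -> str:
        # Model input: cue phrase followed by the masked sentence.
        return f"{self.cue} {self.sentence}"


# A predictor maps (input text, k) to its k best candidate fillers, best first.
Predictor = Callable[[str, int], List[str]]


def hit_at_k(item: MaskedItem, predict: Predictor, k: int) -> bool:
    """True if any accepted filler appears among the top-k candidates."""
    if item.sentence.count(MASK) != 1:
        raise ValueError("each sentence must contain exactly one masked word")
    candidates = [c.lower() for c in predict(item.text(), k)[:k]]
    return any(a.lower() in candidates for a in item.accepted)


def accuracy_at_k(items: Sequence[MaskedItem], predict: Predictor, k: int) -> float:
    """Fraction of items with at least one accepted filler in the top k."""
    if not items:
        raise ValueError("no items to evaluate")
    return sum(hit_at_k(item, predict, k) for item in items) / len(items)


def toy_predict(text: str, k: int) -> List[str]:
    """Stand-in for a real masked language model (assumption for the demo)."""
    table = {
        "pão de [MASK]": ["açúcar", "queijo", "forma"],
        "As meninas são [MASK]": ["bonito", "altos", "bons"],
    }
    for key, candidates in table.items():
        if key in text:
            return candidates[:k]
    return []


# Invented example items, one per subset.
items = [
    MaskedItem("Comida típica de Minas Gerais:", "Ela gosta de pão de [MASK].",
               ["queijo"], "MWE"),
    MaskedItem("Concordância nominal:", "As meninas são [MASK].",
               ["bonitas", "altas"], "nominal agreement"),
]

print(accuracy_at_k(items, toy_predict, k=1))  # 0.0
print(accuracy_at_k(items, toy_predict, k=3))  # 0.5
```

Keeping the predictor as a plain callable means the same scoring code could compare several models, for example the three named in the abstract, by swapping in a different function for each.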

