A systematic investigation of learnability from single child linguistic input

Language models (LMs) have demonstrated remarkable proficiency in generating linguistically coherent text, sparking discussions about their relevance to understanding human language learnability. However, a significant gap exists between the training data for these models and the linguistic input a child receives. LMs are typically trained on data that is orders of magnitude larger than, and fundamentally different from, child-directed speech (Warstadt and Bowman, 2022; Warstadt et al., 2023; Frank, 2023a). To address this discrepancy, our research focuses on training LMs on subsets of a single child's linguistic input. Previously, Wang, Vong, Kim, and Lake (2023) found that LMs trained in this setting can form syntactic and semantic word clusters and develop sensitivity to certain linguistic phenomena, but they considered only LSTMs and simpler neural networks trained on a single single-child dataset. Here, to examine the robustness of learnability from single-child input, we systematically train six different model architectures on five datasets (three single-child datasets and two baselines). We find that the models trained on single-child datasets show consistent results that match previous work, underscoring the robustness of forming meaningful syntactic and semantic representations from a subset of a child's linguistic input.

