Understanding how infants perceive speech sounds and language structures remains an open problem. Previous research with artificial neural networks has mainly focused on large, dataset-dependent generative models that aim to replicate language-related phenomena such as "perceptual narrowing". In this paper, we propose a novel approach using a small generative neural network equipped with a continual learning mechanism based on predictive coding for mono- and bilingual speech sound learning (referred to as language sound acquisition during the "critical period") and a compositional optimization mechanism for generation in which no learning is involved (later-infancy sound imitation). Our model prioritizes interpretability and demonstrates the advantages of online learning: unlike deep networks that require substantial offline training, our model updates continuously as new data arrive, making it adaptable and responsive to changing inputs. Through experiments, we demonstrate that when second language acquisition is delayed until later infancy, the challenges associated with learning a foreign language after the critical period are amplified, replicating the perceptual narrowing effect.
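To make the continual learning idea concrete, here is a minimal, hypothetical sketch of online predictive-coding learning in a toy linear generative model. This is an illustration of the general technique, not the paper's architecture: the network names (`TinyPredictiveCoder`), dimensions, and learning rates are all assumptions. The model predicts an input `x` as `x_hat = W @ z`, infers the latent cause `z` by error-driven updates, and updates `W` online from the same prediction error, with no offline training phase.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy model for illustration only (not the paper's network).
class TinyPredictiveCoder:
    def __init__(self, n_obs, n_latent, lr_z=0.2, lr_w=0.02):
        self.W = rng.normal(scale=0.5, size=(n_obs, n_latent))
        self.lr_z, self.lr_w = lr_z, lr_w

    def step(self, x):
        z = np.zeros(self.W.shape[1])
        for _ in range(50):                 # inference: settle the latent cause
            e = x - self.W @ z              # prediction error
            z += self.lr_z * self.W.T @ e   # error-driven latent update
        e = x - self.W @ z
        self.W += self.lr_w * np.outer(e, z)  # continual (online) weight update
        return float(e @ e)                   # residual error for this input

# Online exposure to sounds from one "language" (a fixed 2-D subspace):
# the residual error shrinks as the weights adapt to the input statistics.
model = TinyPredictiveCoder(n_obs=4, n_latent=2)
language = rng.normal(size=(4, 2))
errors = [model.step(language @ rng.normal(size=2)) for _ in range(300)]
```

Because every update uses only the current input, the same loop can be run on a changing input stream, which is the property the abstract contrasts with offline-trained deep networks.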