Human language acquisition is an efficient, supervised, and continual process. In this work, we take inspiration from how human babies acquire their first language and develop a computational process for word acquisition through comparative learning. Motivated by cognitive findings, we generate a small dataset that enables computational models to compare the similarities and differences of various attributes and to learn to filter out and extract the common information for each shared linguistic label. We frame the acquisition of words not only as an information filtration process but also as a representation-symbol mapping. This procedure involves neither a fixed vocabulary size nor a discriminative objective, and it allows the models to continually learn more concepts efficiently. Our results in controlled experiments show the potential of this approach for efficient continual learning of grounded words.