This paper proposes a new approach to Machine Learning (ML) that focuses on unsupervised continuous context-dependent learning of complex patterns. Although the proposal is partly inspired by some of the current knowledge about the structural and functional properties of the mammalian brain, we do not claim that biological systems work in an analogous way (nor the opposite). Based on some properties of the cerebellar cortex and adjacent structures, a proposal suitable for practical problems is presented. A synthetic structure capable of identifying and predicting complex temporal series will be defined and experimentally tested. The system relies heavily on prediction to help identify and learn patterns based on previously acquired contextual knowledge. As a proof of concept, the proposed system is shown to be able to learn, identify and predict a remarkably complex temporal series such as human speech, with no prior knowledge. From raw data, without any adaptation in the core algorithm, the system is able to identify certain speech structures from a set of Spanish sentences. Unlike conventional ML, the proposal can learn with a reduced training set. Although the idea can be applied to a constrained problem, such as the detection of unknown vocabulary in a speech, it could be used in more applications, such as vision, or (by incorporating the missing biological periphery) fit into other ML techniques. Given the trivial computational primitives used, a potential hardware implementation will be remarkably frugal. Coincidentally, the proposed model not only conforms to a plausible functional framework for biological systems but may also explain many elusive cognitive phenomena.
翻译:本文提出了一种新的机器学习方法,聚焦于复杂模式的非监督连续上下文相关学习。尽管该方案部分受哺乳动物大脑结构与功能特性的现有认知启发,但我们未声称生物系统以类似方式运作(亦未持相反观点)。基于小脑皮层及邻近结构的若干特性,本文提出了一种适用于实际问题的方案,定义并实验验证了一种能够识别与预测复杂时间序列的合成结构。该系统高度依赖预测机制,借助先前获取的上下文知识辅助模式识别与学习。作为概念验证,所提系统在无先验知识条件下,被证明能够学习、识别并预测如人类语音般极为复杂的时间序列。通过原始数据输入,核心算法无需任何调整,即可从一组西班牙语句子中识别特定语音结构。与传统机器学习不同,该方案仅需少量训练集即可完成学习。尽管该思想可应用于有限问题(如语音中未知词汇的检测),但其潜在应用范围更广,涵盖视觉等领域,或通过整合缺失的生物外围结构,融入其他机器学习技术。由于采用极简计算原语,其潜在硬件实现将极为经济。巧合的是,所提模型不仅符合生物系统的合理功能框架,还可能解释诸多难以捉摸的认知现象。