Although deep learning has achieved remarkable success, there is no clear explanation of why it works so well. To address this question quantitatively, we first need a mathematical framework that explains what learning is. We construct such a framework, one that provides a unified understanding of all types of learning, including deep learning and learning in the brain. We call it the learning principle; it states that all learning is equivalent to estimating the probability of the input data. Beyond deriving this principle, we discuss its application to practical machine learning models. For example, we find that conventional supervised learning is equivalent to estimating conditional probabilities, and we use this result to make supervised learning more effective and more general. We also propose a new method that defines the values of the estimated probability using differentiation, and show that it enables unsupervised learning on an arbitrary dataset without any prior knowledge. In this sense, the method is truly general-purpose machine learning. Moreover, by considering the time evolution of a fully or partially connected model and applying this new method, we describe the learning mechanism in the brain. The learning principle offers solutions to many unsolved problems in deep learning and cognitive neuroscience.