Incorporating novelties into deep learning systems remains a challenging problem. Introducing new information to a machine learning system can interfere with previously stored data and potentially alter the global model paradigm, especially when dealing with non-stationary sources. In such cases, traditional approaches based on validation error minimization offer limited advantages. To address this, we propose a training algorithm inspired by Stuart Kauffman's notion of the Adjacent Possible. This novel training methodology explores new data spaces during the learning phase. It predisposes the neural network to smoothly accept and integrate data sequences with statistical characteristics different from those expected. The maximum distance compatible with such inclusion depends on a specific parameter: the sampling temperature used in the explorative phase of the present method. This algorithm, called Dreaming Learning, anticipates potential regime shifts over time, enhancing the neural network's responsiveness to non-stationary events that alter statistical properties. To assess the advantages of this approach, we apply this methodology to unexpected statistical changes in Markov chains and non-stationary dynamics in textual sequences. We demonstrate its ability to improve the auto-correlation of generated textual sequences by $\sim 29\%$ and speed up loss convergence by $\sim 100\%$ in the case of a paradigm shift in Markov chains.
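The core mechanism described above, sampling from the model itself at an elevated temperature during an explorative ("dreaming") phase and interleaving those sampled sequences into training, can be illustrated with a minimal sketch. This is not the authors' implementation: the `BigramModel` class, the `temperature_sample` helper, and all parameter choices below are hypothetical stand-ins, using a count-based Markov bigram model in place of a neural network.

```python
import random

def temperature_sample(probs, T, rng):
    # Rescale p_i -> p_i^(1/T) / Z; T > 1 flattens the distribution,
    # letting the sampler explore the "Adjacent Possible" of the model.
    weights = [p ** (1.0 / T) for p in probs]
    z = sum(weights)
    weights = [w / z for w in weights]
    r = rng.random()
    acc = 0.0
    for i, w in enumerate(weights):
        acc += w
        if r <= acc:
            return i
    return len(weights) - 1

class BigramModel:
    """Hypothetical count-based stand-in for the trained sequence model."""
    def __init__(self, n_states, alpha=1.0):
        self.n = n_states
        # Laplace smoothing so every transition keeps nonzero probability.
        self.counts = [[alpha] * n_states for _ in range(n_states)]

    def probs(self, state):
        row = self.counts[state]
        z = sum(row)
        return [c / z for c in row]

    def update(self, seq):
        # "Training": accumulate observed transition counts.
        for a, b in zip(seq, seq[1:]):
            self.counts[a][b] += 1

    def dream(self, length, T, rng, start=0):
        # Dreaming phase: generate a sequence from the model itself
        # at sampling temperature T, then feed it back as training data.
        seq = [start]
        for _ in range(length - 1):
            seq.append(temperature_sample(self.probs(seq[-1]), T, rng))
        return seq

# Usage: train on a stationary regime, then dream at high temperature
# so the model pre-adapts to transitions it has never observed.
rng = random.Random(0)
model = BigramModel(n_states=3)
real = [0, 1, 0, 1, 0, 1, 0, 1]            # regime A: alternating 0, 1
model.update(real)
dreamed = model.dream(length=20, T=2.0, rng=rng)
model.update(dreamed)                       # interleave dreamed sequences
```

The temperature `T` plays the role described in the abstract: it sets how far from the learned statistics the explorative phase is allowed to wander, and hence the maximum statistical distance of new data the model can later absorb smoothly.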