Mathematical abstractions and models are valuable for understanding the behavior and learning dynamics of (deep) artificial neural networks. They offer a simplified view of network performance and support systematic investigation through simulation. In this paper, we propose using the framework of stochastic processes, which has so far been underutilized for this purpose. Our approach models the activation patterns of thresholded nodes in (deep) artificial neural networks as stochastic processes. We consider only activation frequency, borrowing techniques that neuroscience applies to the spike trains of real neurons. We extract spiking activity during a classification task and model it as a Poisson arrival process. We fit the proposed model to observed data from various artificial neural networks on image recognition tasks and derive parameters that describe the activation patterns in each network. Our analysis covers randomly initialized, generalizing, and memorizing networks, revealing consistent differences across architectures and training sets. By computing the mean firing rate, the mean Fano factor, and the variances, we find stable indicators of memorization during learning, which give insight into network behavior. The proposed model shows promise for describing activation patterns and could serve as a general framework for future investigations, with potential applications in theoretical simulations, pruning, and transfer learning.
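As a rough illustration of the quantities named above, the following sketch computes a per-node mean firing rate and Fano factor from thresholded activations. It is a minimal example under stated assumptions, not the procedure used in the paper: the function names `spike_counts` and `firing_statistics`, the threshold value, and the choice to bin consecutive input samples into fixed-size windows are all hypothetical.

```python
import numpy as np


def spike_counts(activations, threshold=0.0, window=10):
    """Count threshold crossings per node in consecutive windows of inputs.

    activations: array of shape (n_inputs, n_nodes), one row per input sample.
    Returns an integer array of shape (n_windows, n_nodes).
    The windowing scheme is an assumption made for this sketch.
    """
    spikes = np.asarray(activations) > threshold  # a "spike" is an activation above threshold
    n_windows = spikes.shape[0] // window
    if n_windows == 0:
        raise ValueError("need at least `window` input samples")
    trimmed = spikes[: n_windows * window]  # drop the incomplete last window
    return trimmed.reshape(n_windows, window, -1).sum(axis=1)


def firing_statistics(counts, window):
    """Return (firing rate, Fano factor) per node from windowed spike counts.

    The Fano factor is var(counts) / mean(counts); it is left as NaN for
    nodes that never spike. At least two windows are needed for the
    sample variance (ddof=1) to be defined.
    """
    mean_counts = counts.mean(axis=0)
    var_counts = counts.var(axis=0, ddof=1)
    rate = mean_counts / window  # spikes per input sample
    fano = np.full(mean_counts.shape, np.nan)
    active = mean_counts > 0
    fano[active] = var_counts[active] / mean_counts[active]
    return rate, fano


if __name__ == "__main__":
    # Synthetic stand-in for recorded activations (not data from the paper).
    rng = np.random.default_rng(0)
    acts = rng.normal(size=(1000, 8))
    counts = spike_counts(acts, threshold=1.0, window=50)
    rate, fano = firing_statistics(counts, window=50)
    print("mean firing rate:", rate.mean())
    print("mean Fano factor:", np.nanmean(fano))
```

For a Poisson arrival process the Fano factor equals 1, so the per-node deviation from 1 gives a simple check on how well the Poisson assumption holds for a given network.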