Spiking Neural Networks (SNNs). SNNs take a more biologically inspired approach than conventional artificial neural networks. These models are characterized by complex dynamics between neurons and spikes, and these dynamics are very sensitive to hyperparameters, which makes their optimization challenging. To tackle hyperparameter optimization of SNNs, we first extended the signal loss issue of SNNs to what we call silent networks: networks that fail to emit enough spikes at their outputs because of mistuned hyperparameters or architecture. Search spaces are generally heavily restricted, sometimes even discretized, to prevent the sampling of such networks. By defining an early stopping criterion that detects silent networks and by designing specific constraints, we were able to instantiate larger and more flexible search spaces. We applied a constrained Bayesian optimization technique, asynchronously parallelized because the evaluation time of an SNN is highly stochastic. Large-scale experiments were carried out on a multi-GPU petascale architecture. Results show that leveraging silent networks accelerates the search while maintaining good performance of both the optimization algorithm and the best solution obtained. We applied our methodology to two popular training algorithms, spike-timing-dependent plasticity and surrogate gradient. Early detection allowed us to avoid worthless and costly computation and directed the search toward promising hyperparameter combinations. Our methodology could be applied to multi-objective problems, where spiking activity is often minimized to reduce energy consumption. In this scenario, it becomes essential to find the delicate frontier between low-spiking and silent networks. Finally, our approach may have implications for neural architecture search, particularly in defining suitable spiking architectures.
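The early stopping idea described above can be illustrated with a minimal sketch. The abstract does not give the exact criterion, so everything here is an assumption made for illustration: the function names `is_silent` and `evaluate_with_early_stop`, the thresholds `min_spikes` and `max_silent_fraction`, and the `patience` parameter are all hypothetical. The sketch flags a batch as silent when too many samples produce too few output spikes, and aborts an evaluation after several consecutive silent batches.

```python
from typing import Iterable, Sequence


def is_silent(batch_spike_counts: Sequence[Sequence[int]],
              min_spikes: int = 1,
              max_silent_fraction: float = 0.5) -> bool:
    """Return True if too many samples emit fewer than `min_spikes` output spikes.

    batch_spike_counts[i][j] is the number of spikes emitted by output
    neuron j for sample i. Both thresholds are illustrative assumptions,
    not values taken from the source.
    """
    if not batch_spike_counts:
        raise ValueError("empty batch")
    silent = sum(1 for counts in batch_spike_counts if sum(counts) < min_spikes)
    return silent / len(batch_spike_counts) > max_silent_fraction


def evaluate_with_early_stop(batches: Iterable[Sequence[Sequence[int]]],
                             patience: int = 2) -> dict:
    """Scan output spike counts batch by batch and stop after `patience`
    consecutive silent batches (hypothetical stopping rule)."""
    consecutive = 0
    step = 0
    for step, batch in enumerate(batches, start=1):
        # Reset the counter as soon as a batch shows enough output activity.
        consecutive = consecutive + 1 if is_silent(batch) else 0
        if consecutive >= patience:
            return {"silent": True, "batches_seen": step}
    return {"silent": False, "batches_seen": step}


if __name__ == "__main__":
    active = [[3, 0], [1, 2]]   # every sample emits output spikes
    quiet = [[0, 0], [0, 0]]    # no sample emits any output spike
    print(evaluate_with_early_stop([active, quiet, quiet, active]))
```

In a constrained optimization loop, the `silent` flag could plausibly be reported to the optimizer as a constraint violation, so that the remaining batches of a silent configuration are never computed. How the authors actually couple the criterion to the constrained Bayesian optimizer is not stated in this passage.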