Implications of uncertain objective functions and the permutation symmetry of traditional deep learning architectures are discussed. It is shown that traditional architectures are polluted by an astronomical number of equivalent global and local optima. Uncertainty of the objective renders local optima unattainable, and, as the network grows, the global optimization landscape likely becomes a tangled web of valleys and ridges. Some remedies that reduce or eliminate these ghost optima are discussed, including forced pre-pruning, re-ordering, ortho-polynomial activations, and modular bio-inspired architectures.