Polynomial Neural Networks (PNNs) possess a rich algebraic and geometric structure. However, their identifiability -- a key property for ensuring interpretability -- remains poorly understood. In this work, we present a comprehensive analysis of the identifiability of deep PNNs, covering architectures both with and without bias terms. Our results reveal an intricate interplay between activation degrees and layer widths in achieving identifiability. As special cases, we show that architectures with non-increasing layer widths are generically identifiable under mild conditions, while encoder-decoder networks are identifiable when the decoder widths do not grow too rapidly relative to the activation degrees. Our proofs are constructive and center on a connection between deep PNNs and low-rank tensor decompositions, together with Kruskal-type uniqueness theorems. We also settle an open conjecture on the dimension of PNN neurovarieties, and provide new bounds on the activation degrees required for a neurovariety to attain its expected dimension.
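The tensor connection underlying the proofs can be illustrated in the shallow, bias-free case. The sketch below (our own illustration, not the paper's construction) evaluates a one-hidden-layer PNN with monomial activation σ(t) = t^d, f(x) = Σ_i c_i (w_i · x)^d, and checks that it agrees with contracting the order-d symmetric tensor T = Σ_i c_i w_i^{⊗d} against x^{⊗d}; identifiability of the weights (W, c) then reduces to uniqueness of this low-rank symmetric tensor decomposition.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, d = 4, 3, 3  # input dim, hidden width, activation degree (d = 3 here)

W = rng.standard_normal((m, n))  # rows w_i: first-layer weights
c = rng.standard_normal(m)       # output-layer coefficients

def pnn(x):
    """Evaluate the PNN directly: sum_i c_i * (w_i . x)^d."""
    return c @ (W @ x) ** d

# Order-3 symmetric tensor T[a,b,c] = sum_i c_i * W[i,a] * W[i,b] * W[i,c],
# i.e. T = sum_i c_i * w_i (x) w_i (x) w_i (rank <= m decomposition).
T = np.einsum("i,ia,ib,ic->abc", c, W, W, W)

x = rng.standard_normal(n)
via_tensor = np.einsum("abc,a,b,c->", T, x, x, x)  # <T, x^{(x)3}>
print(np.allclose(pnn(x), via_tensor))
```

The two evaluations coincide for every input, so the network's output polynomial is exactly the symmetric tensor T; recovering (W, c) from the function is the same problem as recovering the rank-one terms of T, which is where Kruskal-type uniqueness conditions enter.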