Polynomial Neural Networks (PNNs) possess a rich algebraic and geometric structure. However, their identifiability -- a key property for ensuring interpretability -- remains poorly understood. In this work, we present a comprehensive analysis of the identifiability of deep PNNs, covering architectures both with and without bias terms. Our results reveal an intricate interplay between activation degrees and layer widths in achieving identifiability. As special cases, we show that architectures with non-increasing layer widths are generically identifiable under mild conditions, while encoder-decoder networks are identifiable when the decoder widths do not grow too rapidly relative to the activation degrees. Our proofs are constructive and center on a connection between deep PNNs and low-rank tensor decompositions, combined with Kruskal-type uniqueness theorems. We also settle an open conjecture on the dimension of PNN neurovarieties, and provide new bounds on the activation degrees required for the neurovariety to attain its expected dimension.
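To make the tensor connection concrete, the following minimal sketch (Python/NumPy; the monomial activation $\sigma(t)=t^d$ is assumed, as is common in the PNN literature, and names such as `pnn`, `A`, `c` are purely illustrative) checks numerically that a shallow PNN of activation degree 2 computes the same polynomial as the contraction of a low-rank symmetric tensor built from its weights. It is Kruskal-type uniqueness for such low-rank decompositions that drives identifiability arguments of this kind.

```python
import numpy as np

rng = np.random.default_rng(0)

# A shallow PNN f(x) = sum_i c_i * (a_i . x)**degree with monomial activation.
d_in, width, degree = 3, 4, 2   # input dimension, hidden width, activation degree

A = rng.standard_normal((width, d_in))   # first-layer weights (rows a_i)
c = rng.standard_normal(width)           # output-layer weights

def pnn(x):
    """Shallow PNN with monomial activation sigma(t) = t**degree."""
    return c @ (A @ x) ** degree

# The same polynomial is the contraction of the rank-<=width symmetric tensor
# T = sum_i c_i * (a_i outer a_i) against x (x) x.  Kruskal-type theorems give
# conditions under which such a decomposition -- and hence the weights, up to
# trivial rescaling/permutation symmetries -- is unique.
T = sum(c[i] * np.outer(A[i], A[i]) for i in range(width))

x = rng.standard_normal(d_in)
assert np.isclose(pnn(x), x @ T @ x)   # f(x) = <T, x^(x)2> holds numerically
```

For activation degree $d > 2$ the analogous object is an order-$d$ symmetric tensor, i.e. a Waring-type decomposition $\sum_i c_i (a_i^\top x)^d$; deep architectures compose such maps layer by layer.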