We study the parameterized complexity of training two-layer neural networks with respect to the dimension of the input data and the number of hidden neurons, considering ReLU and linear threshold activation functions. Although the computational complexity of these problems has been studied numerous times in recent years, several questions remain open. We answer questions by Arora et al. [ICLR '18] and Khalife and Basu [IPCO '22] by showing that both problems are NP-hard in two dimensions, which excludes any polynomial-time algorithm for constant dimension unless P = NP. We also answer a question by Froese et al. [JAIR '22] by proving W[1]-hardness with respect to the dimension for four ReLUs (or two linear threshold neurons) with zero training error. Finally, in the ReLU case, we show fixed-parameter tractability for the combined parameter of the number of dimensions and the number of ReLUs if the network is assumed to compute a convex map. Our results settle the complexity status with respect to these parameters almost completely.
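To make the training problem concrete, the following is a minimal sketch of the objective for a two-layer ReLU network. It is an illustration under stated assumptions, not the paper's formal definition: the function names, the squared-error loss, and the parameterization (one weight vector, bias, and output coefficient per hidden neuron) are chosen here for the example. The parameters studied above appear as the input dimension `d` (length of each data point) and the number of hidden neurons `k` (length of `weights`).

```python
from typing import Sequence


def relu(z: float) -> float:
    """Rectified linear unit: max(0, z)."""
    return max(0.0, z)


def two_layer_relu(
    x: Sequence[float],
    weights: Sequence[Sequence[float]],
    biases: Sequence[float],
    coeffs: Sequence[float],
) -> float:
    """Output of a two-layer network: sum_j coeffs[j] * relu(weights[j] . x + biases[j])."""
    return sum(
        a * relu(sum(wi * xi for wi, xi in zip(w, x)) + b)
        for w, b, a in zip(weights, biases, coeffs)
    )


def training_error(
    points: Sequence[Sequence[float]],
    labels: Sequence[float],
    weights: Sequence[Sequence[float]],
    biases: Sequence[float],
    coeffs: Sequence[float],
) -> float:
    """Sum of squared errors over the data set (loss function assumed for this sketch)."""
    return sum(
        (two_layer_relu(x, weights, biases, coeffs) - y) ** 2
        for x, y in zip(points, labels)
    )


# Hypothetical toy instance with d = 2 input dimensions and k = 2 hidden ReLUs.
# The labels equal |x_1|, which two ReLUs represent exactly.
points = [(1.0, 0.0), (-2.0, 5.0), (0.0, 3.0)]
labels = [1.0, 2.0, 0.0]
weights = [(1.0, 0.0), (-1.0, 0.0)]
biases = [0.0, 0.0]
coeffs = [1.0, 1.0]  # nonnegative output coefficients give a convex map

error = training_error(points, labels, weights, biases, coeffs)
```

Training asks for `weights`, `biases`, and `coeffs` that minimize this error; the zero-training-error case asks whether the value 0 is attainable, as it is in the toy instance. Evaluating the error for given parameters is easy; the hardness results concern the search over parameters.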