We study an infinite-dimensional optimization problem that aims to identify the Nemytskii operator in the nonlinear part of a prototypical semilinear elliptic partial differential equation (PDE) that minimizes the distance between the PDE solution and a given desired state. In contrast to previous works, we consider this identification problem in a low-regularity regime in which the function inducing the Nemytskii operator is a priori only known to be an element of $H^1_{\mathrm{loc}}(\mathbb{R})$. This makes the studied problem class a suitable point of departure for the rigorous analysis of training problems for learning-informed PDEs in which an unknown superposition operator is approximated by means of a neural network with nonsmooth activation functions (ReLU, leaky ReLU, etc.). We establish that, despite the low regularity of the controls, it is possible to derive a classical stationarity system for local minimizers and to solve the considered problem by means of a gradient projection method. The convergence of the resulting algorithm is proven in the function space setting. It is also shown that the established first-order necessary optimality conditions imply that locally optimal superposition operators share various characteristic properties with commonly used activation functions: they are always sigmoidal, continuously differentiable away from the origin, and typically possess a distinct kink at zero. The paper concludes with numerical experiments which confirm the theoretical findings.
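For readers unfamiliar with gradient projection, the generic iteration $u_{k+1} = P(u_k - \tau \nabla J(u_k))$ can be sketched on a toy finite-dimensional problem. This is only an illustration under stated assumptions: the box-constrained least-squares objective, the fixed step size, and all names (`project_box`, `gradient_projection`, `A`, `b`) are hypothetical and are not taken from the paper, whose algorithm is formulated and analyzed in function space.

```python
import numpy as np


def project_box(u, lower, upper):
    """Euclidean projection onto the box {u : lower <= u <= upper}."""
    return np.clip(u, lower, upper)


def gradient_projection(grad, project, u0, step, tol=1e-10, max_iter=10_000):
    """Iterate u <- P(u - step * grad(u)) until successive iterates agree."""
    u = project(np.asarray(u0, dtype=float))
    for _ in range(max_iter):
        u_next = project(u - step * grad(u))
        if np.linalg.norm(u_next - u) <= tol:
            return u_next
        u = u_next
    return u


# Toy objective (an assumption for illustration): J(u) = 0.5 * ||A u - b||^2.
A = np.array([[2.0, 0.0], [0.0, 1.0]])
b = np.array([4.0, -3.0])


def grad(u):
    """Gradient of the toy objective J."""
    return A.T @ (A @ u - b)


# Fixed step 1/L, where L = ||A||_2^2 is the Lipschitz constant of grad.
step = 1.0 / np.linalg.norm(A, 2) ** 2

u_star = gradient_projection(
    grad, lambda u: project_box(u, -1.0, 1.0), np.zeros(2), step
)
print(u_star)
```

The unconstrained minimizer of this toy objective lies outside the box, so the iterates end up on its boundary. A fixed point of the iteration satisfies $u = P(u - \tau \nabla J(u))$, which is the finite-dimensional analogue of a first-order stationarity condition.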