Deep neural networks (DNNs) have achieved exceptional performance across a wide range of fields by learning complex nonlinear mappings from large-scale datasets. However, they suffer from high computational costs and limited interpretability. To address these issues, hybrid approaches that integrate physics with AI have attracted growing interest. This paper introduces a novel physics-based AI model, the "Nonlinear Schr\"odinger Network", which treats the Nonlinear Schr\"odinger Equation (NLSE) as a general-purpose trainable model for learning complex patterns, including nonlinear mappings and memory effects, from data. Existing physics-informed machine learning methods use neural networks to approximate the solutions of partial differential equations (PDEs); in contrast, our approach treats the PDE itself as the trainable model, yielding general nonlinear mappings that would otherwise require neural networks. As a physics-inspired approach, it offers a more interpretable and parameter-efficient alternative to conventional black-box neural networks, achieving comparable or better accuracy on time series classification tasks while requiring significantly fewer parameters. Notably, the trained Nonlinear Schr\"odinger Network is interpretable: every parameter has a physical meaning as a property of a virtual physical system that transforms the data into a more separable space. This interpretability offers insight into the underlying dynamics of the data transformation. We also explore applications to time series forecasting. While our current implementation uses the NLSE, the proposed strategy of employing physics equations as trainable models to learn nonlinear mappings from data is not limited to the NLSE and may be extended to other master equations of physics.
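To make the core idea concrete, the following is a minimal sketch, not the authors' implementation, of how an NLSE-based transformation of a time series might look. It uses the standard split-step Fourier method to propagate a signal through a virtual nonlinear dispersive medium; the parameter names `beta` (dispersion) and `gamma` (nonlinearity), the step sizes, and the use of NumPy are all illustrative assumptions, with `beta` and `gamma` playing the role of the trainable, physically meaningful parameters described above.

```python
import numpy as np

def nlse_layer(x, beta, gamma, dz=0.1, steps=8):
    """Propagate a signal through a virtual NLSE medium via the
    split-step Fourier method. Here `beta` (dispersion) and `gamma`
    (nonlinearity) stand in for the trainable physical parameters;
    this is an illustrative sketch, not the paper's implementation."""
    u = np.asarray(x).astype(np.complex128)
    n = u.shape[-1]
    omega = 2 * np.pi * np.fft.fftfreq(n)           # angular frequency grid
    disp = np.exp(-0.5j * beta * omega**2 * dz)     # linear dispersion operator per step
    for _ in range(steps):
        u = np.fft.ifft(np.fft.fft(u) * disp)       # dispersion applied in frequency domain
        u = u * np.exp(1j * gamma * np.abs(u)**2 * dz)  # Kerr-like nonlinearity in time domain
    return u

# Toy usage: transform a Gaussian pulse into a new feature representation
t = np.linspace(-5, 5, 256)
signal = np.exp(-t**2)
features = np.abs(nlse_layer(signal, beta=1.0, gamma=2.0))
print(features.shape)
```

In a classification setting, `beta` and `gamma` would be fitted by gradient descent so that the transformed signals become more separable, which is what makes every learned parameter directly readable as a property of the virtual medium.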