Projection-based model reduction enables efficient simulation of complex dynamical systems by constructing low-dimensional surrogate models from high-dimensional data. The Operator Inference (OpInf) approach learns such reduced surrogate models through a two-step process: constructing a low-dimensional basis via the Singular Value Decomposition (SVD) to compress the data, then solving a linear least-squares (LS) problem to infer reduced operators that govern the dynamics in this compressed space, all without access to the underlying code or full model operators, i.e., non-intrusively. Traditional OpInf operates as a batch learning method, in which both the SVD and LS steps process all data simultaneously. This poses a barrier to deploying the approach in large-scale applications where dataset sizes prevent loading all data into memory at once. Moreover, the traditional batch approach does not naturally allow model updates using new data acquired during online computation. To address these limitations, we propose Streaming OpInf, which learns reduced models from sequentially arriving data streams. Our approach employs incremental SVD for adaptive basis construction and recursive LS for streaming operator updates, eliminating the need to store complete datasets while enabling online model adaptation. The approach can flexibly combine different choices of streaming algorithms for numerical linear algebra; we systematically explore the impact of these choices both analytically and numerically to identify effective combinations for accurate reduced-model learning. Numerical experiments on benchmark problems and a large-scale turbulent channel flow demonstrate that Streaming OpInf achieves accuracy comparable to batch OpInf while reducing memory requirements by over 99% and enabling dimension reductions exceeding 31,000x, resulting in orders-of-magnitude faster predictions.
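As a minimal illustration of the two learning steps (SVD-based compression of snapshot data, then a linear least-squares fit of a reduced operator) and of a recursive LS update of the kind used in the streaming variant, consider the following toy sketch for a linear system. All variable names, dimensions, and the fixed-basis simplification are illustrative assumptions, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, r = 50, 200, 3                          # full dim, snapshot count, reduced dim
A = -np.diag(np.linspace(0.5, 5.0, n))        # hypothetical stable full-order operator
X = rng.standard_normal((n, m))               # snapshot matrix, one state per column
Xdot = A @ X                                  # time-derivative data

# Step 1: low-dimensional basis via the SVD (POD basis of the snapshots).
U, _, _ = np.linalg.svd(X, full_matrices=False)
V = U[:, :r]                                  # rank-r orthonormal basis

# Step 2: project the data and solve a linear LS problem for the
# reduced operator A_hat, so that Xdot_r ≈ A_hat @ X_r.
X_r, Xdot_r = V.T @ X, V.T @ Xdot
A_hat = np.linalg.lstsq(X_r.T, Xdot_r.T, rcond=None)[0].T

# Streaming flavor of Step 2: recursive LS absorbs one new snapshot pair
# without re-solving the full problem (the basis is held fixed here for
# brevity; incremental SVD would also update V in the streaming setting).
P = np.linalg.inv(X_r @ X_r.T)                # inverse Gram matrix of reduced data
x_new = rng.standard_normal(n)
xr, dr = V.T @ x_new, V.T @ (A @ x_new)       # compressed new snapshot and derivative
k = P @ xr / (1.0 + xr @ P @ xr)              # Kalman-style gain vector
A_hat += np.outer(dr - A_hat @ xr, k)         # rank-one update of the operator
P -= np.outer(k, xr @ P)                      # Sherman–Morrison update of the Gram inverse
```

After the rank-one update, A_hat matches the batch LS solution recomputed with the enlarged dataset, which is the property that lets the streaming method discard raw snapshots once they are processed.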