Selecting a finite dictionary of observables whose span is Koopman-invariant is a central challenge in data-driven Koopman operator approximation. We address this problem by exploiting zero-block structure in Extended Dynamic Mode Decomposition (EDMD) matrices. We show that any sub-dictionary whose span is Koopman-invariant induces an exact zero block in the EDMD matrix, even for finite data. We then show that such blocks can be detected by applying PageRank to a row-normalized EDMD matrix constructed from a large initial dictionary. The theory extends to approximately invariant subspaces and yields stronger guarantees for personalized PageRank (PPR) when the seed observables lie inside the target block and reach all observables in that block. Combining EDMD concentration bounds with PageRank perturbation theory gives end-to-end detection guarantees with $O(1/\sqrt{M})$ finite-sample scaling and explicit constants. More generally, without assuming an invariant subspace exists, high PPR mass on a sub-dictionary controls discounted multi-step leakage from the seed observables. Numerical experiments on the Duffing oscillator, Van der Pol oscillator, Lorenz system, and a three-well Ramachandran potential suggest that the method identifies compact, interpretable dictionaries with accurate predictions.
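The zero-block claim and its detection by personalized PageRank can be illustrated on a toy problem. The following is a minimal sketch under stated assumptions, not the paper's implementation: the discrete map `step`, the five-monomial dictionary, the parameters `lam`, `mu`, and `alpha`, and the choice to row-normalize the absolute values of the transposed EDMD matrix are all illustrative. The convention used here is $\Psi(Y) \approx \Psi(X) K$, so column $j$ of $K$ holds the coefficients of $\psi_j \circ F$; under the transposed convention the zero block would appear in the opposite corner.

```python
import numpy as np

rng = np.random.default_rng(0)
lam, mu = 0.9, 0.5  # illustrative parameters, not taken from the paper

def step(x):
    """Toy map whose observables {x1, x2, x1^2} span a Koopman-invariant subspace."""
    x1, x2 = x[:, 0], x[:, 1]
    return np.column_stack([lam * x1, mu * x2 + (lam**2 - mu) * x1**2])

def dictionary(x):
    """Hypothetical initial dictionary: three invariant observables plus two extras."""
    x1, x2 = x[:, 0], x[:, 1]
    return np.column_stack([x1, x2, x1**2, x1 * x2, x2**2])

names = ["x1", "x2", "x1^2", "x1*x2", "x2^2"]
S = [0, 1, 2]     # indices of the invariant sub-dictionary
rest = [3, 4]     # indices of the remaining observables

# Snapshot pairs (x_m, F(x_m)) with M = 200 samples.
X = rng.uniform(-1.0, 1.0, size=(200, 2))
Y = step(X)

# EDMD by least squares: Psi(Y) ~= Psi(X) @ K.
K, *_ = np.linalg.lstsq(dictionary(X), dictionary(Y), rcond=None)

# Block of K coupling the invariant observables to the rest; zero up to round-off.
zero_block = np.abs(K[np.ix_(rest, S)]).max()

# Row j of W lists the observables that psi_j o F depends on.
W = np.abs(K).T
P = W / W.sum(axis=1, keepdims=True)  # row-stochastic transition matrix

# Personalized PageRank: pi^T = (1 - alpha) s^T + alpha pi^T P.
alpha = 0.85
seed = np.zeros(len(names))
seed[[0, 1]] = 0.5  # seed on x1 and x2, both inside the target block
ppr = (1 - alpha) * np.linalg.solve(np.eye(len(names)) - alpha * P.T, seed)

leak = ppr[rest].sum()  # PPR mass that escapes the invariant block
print("max |zero block| :", zero_block)
print("PPR mass by name :", dict(zip(names, np.round(ppr, 4))))
print("mass outside S   :", leak)
```

Because the seeds `x1` and `x2` together reach every observable in `S`, the PPR mass should concentrate on `S`, with the mass on the remaining observables at round-off level. Seeding on `x2` alone would instead isolate the smaller invariant sub-dictionary `{x2, x1^2}`, since `x2` never reaches `x1` in this graph.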