We analyze connections between two low-rank modeling approaches for dynamical data developed over the last decade. The first is the coherence problem (or coherent set approach), which seeks groups of states whose evolution under the action of a stochastic matrix is maximally distinguishable from that of other groups. The second is a low-rank factorization approach for stochastic matrices, called Direct Bayesian Model Reduction (DBMR), which estimates the low-rank factors directly from observed data. We show that DBMR yields a low-rank model that is a projection of the full model, and exploit this insight to infer bounds on a quantitative measure of coherence within the reduced model. Both approaches can be formulated as optimization problems, and we also prove a bound relating their respective objectives. More broadly, this work relates the two classical loss functions of nonnegative matrix factorization, namely the Frobenius norm and the generalized Kullback--Leibler divergence, and suggests new links between likelihood-based and projection-based estimation of probabilistic models.
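For reference, the two classical loss functions of nonnegative matrix factorization named above are usually written as follows. The notation is assumed here for illustration and is not taken from the paper: $A \in \mathbb{R}_{\ge 0}^{m \times n}$ is the matrix being approximated, and $W \in \mathbb{R}_{\ge 0}^{m \times r}$, $H \in \mathbb{R}_{\ge 0}^{r \times n}$ are the low-rank factors with $r \ll \min(m,n)$.

\[
  \| A - WH \|_F^2 \;=\; \sum_{i,j} \bigl( A_{ij} - (WH)_{ij} \bigr)^2 ,
\]
\[
  D_{\mathrm{KL}}(A \,\|\, WH) \;=\; \sum_{i,j} \Bigl( A_{ij} \log \frac{A_{ij}}{(WH)_{ij}} \;-\; A_{ij} \;+\; (WH)_{ij} \Bigr) .
\]

When $A$ and $WH$ have equal total mass, for example when both are stochastic matrices of the same size, the last two terms cancel in the sum and the generalized divergence reduces to the ordinary Kullback--Leibler form $\sum_{i,j} A_{ij} \log \bigl( A_{ij} / (WH)_{ij} \bigr)$.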