We analyze connections between two low-rank modeling approaches from the last decade for treating dynamical data. The first is the coherence problem (or coherent set approach), in which one seeks groups of states that evolve under the action of a stochastic transition matrix in a way that is maximally distinguishable from other groups. The second is a low-rank factorization approach for stochastic matrices, called Direct Bayesian Model Reduction (DBMR), which estimates the low-rank factors directly from observed data. We show that DBMR yields a low-rank model that is a projection of the full model, and we exploit this insight to derive bounds on a quantitative measure of coherence within the reduced model. Both approaches can be formulated as optimization problems, and we also prove a bound relating their respective objectives. On a broader scope, this work relates the two classical loss functions of nonnegative matrix factorization, namely the Frobenius norm and the generalized Kullback--Leibler divergence, and suggests new links between likelihood-based and projection-based estimation of probabilistic models.