Finding the closest points on a statistical model to a given distribution in the probability simplex, with respect to a fixed Wasserstein metric, gives rise to a polyhedral norm distance optimization problem. The complexity of determining the Wasserstein distance from a data point to a model has two components. One is the combinatorial complexity, which is governed by the combinatorics of the Lipschitz polytope of the finite metric in use. The other is the algebraic complexity, which is governed by the polar degrees of the Zariski closure of the model. We find formulas for the polar degrees of rational normal scrolls and of graphical models whose underlying graphs are star trees. We also compute the polar degrees of the graphical models on four binary random variables whose graphs are the path on four vertices and the four-cycle, as well as those of small no-three-way interaction models. Finally, we investigate the algebraic degree of computing the Wasserstein distance for a small subset of these models, and we observe that this algebraic degree is typically smaller than the corresponding polar degree.
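To make the role of the Lipschitz polytope concrete, the following minimal sketch computes the Wasserstein distance between two distributions in the simplest case and exhibits a point of the Lipschitz polytope that certifies it. Everything specific here is an assumption made for illustration and is not taken from the paper: the state space is {0, ..., n-1}, the metric is d(i, j) = |i - j|, and the function names are hypothetical. For this metric the distance is the sum of the absolute gaps between the two cumulative distribution functions, and by Kantorovich-Rubinstein duality it equals the maximum of the pairing of mu - nu with x over all x satisfying |x_i - x_j| <= d(i, j).

```python
from fractions import Fraction
from itertools import accumulate


def cdf_gaps(mu, nu):
    """Partial sums D_k = sum_{i <= k} (mu_i - nu_i) for k = 0..n-2."""
    if len(mu) != len(nu):
        raise ValueError("mu and nu must have the same length")
    diffs = [Fraction(a) - Fraction(b) for a, b in zip(mu, nu)]
    if sum(diffs) != 0:
        raise ValueError("mu and nu must have the same total mass")
    # The last partial sum is always zero, so it is dropped.
    return list(accumulate(diffs))[:-1]


def wasserstein_path_metric(mu, nu):
    """Wasserstein distance for the assumed metric d(i, j) = |i - j|."""
    return sum(abs(g) for g in cdf_gaps(mu, nu))


def dual_potential(mu, nu):
    """A point x of the Lipschitz polytope attaining the dual maximum.

    Consecutive entries satisfy x_k - x_{k+1} = sign(D_k), so
    |x_i - x_j| <= |i - j| holds for every pair of states.
    """
    x = [Fraction(0)]
    for g in cdf_gaps(mu, nu):
        sign = (g > 0) - (g < 0)
        x.append(x[-1] - sign)
    return x


def pairing(mu, nu, x):
    """The linear functional <mu - nu, x> maximized in the dual problem."""
    return sum((Fraction(a) - Fraction(b)) * xi for a, b, xi in zip(mu, nu, x))


if __name__ == "__main__":
    mu = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]
    nu = [Fraction(1, 4), Fraction(1, 4), Fraction(1, 2)]
    x = dual_potential(mu, nu)
    print(wasserstein_path_metric(mu, nu), pairing(mu, nu, x))
```

The primal value and the dual certificate agree, which is the duality underlying the polyhedral norm. Exact rational arithmetic is used so that the equality can be checked without rounding error. For a general finite metric no such closed form is available and the distance is obtained by solving a linear program over the Lipschitz polytope.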