Parameter estimation for Gibbs distributions

From arXiv: This is a significantly extended version of the paper "A Faster Approximation Algorithm for the Gibbs Partition Function" (arXiv:1608.04223), which was published in COLT 2018. It covers many additional topics; most importantly, algorithms to estimate counts and algorithms specialized for integer-valued distributions. arXiv admin note: text overlap with arXiv:1904.03139

We consider \emph{Gibbs distributions}, which are families of probability distributions over a discrete space $\Omega$ with probability mass function of the form $\mu^\Omega_\beta(\omega) \propto e^{\beta H(\omega)}$ for $\beta$ in an interval $[\beta_{\min}, \beta_{\max}]$ and $H(\omega) \in \{0\} \cup [1, n]$. The \emph{partition function} is the normalization factor $Z(\beta)=\sum_{\omega \in\Omega}e^{\beta H(\omega)}$. Two important parameters of these distributions are the log partition ratio $q = \log \tfrac{Z(\beta_{\max})}{Z(\beta_{\min})}$ and the counts $c_x = |H^{-1}(x)|$. These are correlated with system parameters in a number of physical applications and sampling algorithms. Our first main result is an algorithm that estimates the counts $c_x$ using roughly $\tilde O( \frac{q}{\varepsilon^2})$ samples for general Gibbs distributions and $\tilde O( \frac{n^2}{\varepsilon^2} )$ samples for integer-valued distributions (ignoring some second-order terms and parameters), and we show that this is optimal up to logarithmic factors. We illustrate with improved algorithms for counting connected subgraphs and perfect matchings in a graph. We also develop a key subroutine to estimate the partition function $Z$. Specifically, it generates a data structure that can estimate $Z(\beta)$ for \emph{all} values of $\beta$, without further samples. Constructing the data structure requires $O(\frac{q \log n}{\varepsilon^2})$ samples for general Gibbs distributions and $O(\frac{n^2 \log n}{\varepsilon^2} + n \log q)$ samples for integer-valued distributions. This improves over a prior algorithm of Huber (2015), which computes a single point estimate $Z(\beta_{\max})$ using $O( q \log n( \log q + \log \log n + \varepsilon^{-2}))$ samples. We show matching lower bounds, demonstrating that this complexity is optimal as a function of $n$ and $q$ up to logarithmic terms.
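
To make the definitions above concrete, here is a minimal sketch in Python. It is not the sampling-based estimation algorithm of the paper: it assumes the counts $c_x$ are already known exactly and only evaluates the definitions, using the identity $Z(\beta)=\sum_{\omega \in\Omega}e^{\beta H(\omega)} = \sum_x c_x e^{\beta x}$ obtained by grouping the states $\omega$ by their value $x = H(\omega)$. The toy instance (counts $c_x = \binom{3}{x}$, $\beta_{\min}=0$, $\beta_{\max}=1$) and the function names `log_partition` and `value_distribution` are hypothetical and chosen only for illustration.

```python
import math

def log_partition(counts, beta):
    """Return log Z(beta), where Z(beta) = sum_x c_x * exp(beta * x).

    `counts` maps each value x of H to its count c_x = |H^{-1}(x)|.
    The log-sum-exp trick avoids overflow for large beta * x.
    """
    terms = [math.log(c) + beta * x for x, c in counts.items() if c > 0]
    m = max(terms)
    return m + math.log(sum(math.exp(t - m) for t in terms))

def value_distribution(counts, beta):
    """Return P(H(omega) = x) = c_x * exp(beta * x) / Z(beta) for each x."""
    log_z = log_partition(counts, beta)
    return {
        x: math.exp(math.log(c) + beta * x - log_z)
        for x, c in counts.items()
        if c > 0
    }

# Hypothetical toy instance (an assumption, not from the paper):
# n = 3 and c_x = binomial(3, x), so Z(beta) = (1 + e^beta)^3.
counts = {0: 1, 1: 3, 2: 3, 3: 1}
beta_min, beta_max = 0.0, 1.0

# Log partition ratio q = log(Z(beta_max) / Z(beta_min)).
q = log_partition(counts, beta_max) - log_partition(counts, beta_min)

# Distribution of H(omega) at beta_min; at beta = 0 it is c_x / |Omega|.
dist_at_min = value_distribution(counts, beta_min)

if __name__ == "__main__":
    print("q =", q)
    print("P(H = x) at beta_min:", dist_at_min)
```

In this toy case the closed form $Z(\beta) = (1+e^{\beta})^3$ gives $q = 3 \log \tfrac{1+e}{2}$, which can be used to check the output. In the setting of the abstract the counts are unknown, and the point of the algorithms is to estimate $q$, $Z(\beta)$, and $c_x$ from samples of $\mu^\Omega_\beta$ alone.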
