This paper considers the speed of convergence (mixing) of a finite Markov kernel $P$ with respect to the Kullback-Leibler divergence (entropy). Given a Markov kernel one defines either a discrete-time Markov chain (with the $n$-step transition kernel given by the matrix power $P^n$) or a continuous-time Markov process (with the time-$t$ transition kernel given by $e^{t(P-\mathrm{Id})}$). The contraction of entropy for $n=1$ or $t=0+$ is characterized by two famous functional inequalities: the strong data processing inequality (SDPI) and the modified log-Sobolev inequality (MLSI), respectively. When $P=KK^*$ is written as the product of a kernel and its adjoint, one can also consider the ``half-step'' contraction, which is the SDPI for $K$, while the ``full-step'' contraction refers to the SDPI for $P$. The work [DMLM03] claimed that these contraction coefficients (half-step, full-step, and continuous-time) are generally within a constant factor of each other. We disprove this and related conjectures by working out a number of counterexamples. In particular, we construct (a) a continuous-time Markov process that contracts arbitrarily faster than its discrete-time counterpart; and (b) a kernel $P$ such that $P^{m+1}$ contracts arbitrarily better than $P^m$. Hence, our main conclusion is that the four standard inequalities comparing five common notions of entropy and variance contraction are generally not improvable. In the process of analyzing the counterexamples, we survey and sharpen the tools for bounding the contraction coefficients and characterize properties of extremizers of the respective functional inequalities. As our examples range from the Bernoulli-Laplace model and random walks on graphs to birth-death chains, the paper is also intended as a tutorial on computing MLSI, SDPI and other constants for these types of commonly occurring Markov chains.
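To make the two notions of time concrete, the following minimal Python sketch builds both the discrete-time kernel $P^n$ and the continuous-time kernel $e^{t(P-\mathrm{Id})}$ for a toy two-state chain, and measures how much the Kullback-Leibler divergence to the stationary distribution shrinks for one chosen initial distribution. The kernel `P`, the distributions `pi` and `nu`, and all helper names are illustrative assumptions, not taken from the paper. The continuous-time kernel is computed from the identity $e^{t(P-\mathrm{Id})} = \sum_{k\ge 0} e^{-t}\frac{t^k}{k!}P^k$, truncated after a fixed number of terms.

```python
import math

def mat_mul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def identity(size):
    return [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]

def mat_pow(P, n):
    """n-step transition kernel P^n (discrete time)."""
    R = identity(len(P))
    for _ in range(n):
        R = mat_mul(R, P)
    return R

def heat_kernel(P, t, terms=60):
    """Time-t kernel e^{t(P - Id)} via the truncated Poisson series
    sum_k e^{-t} t^k / k! * P^k (truncation length is an assumption)."""
    size = len(P)
    H = [[0.0] * size for _ in range(size)]
    power = identity(size)       # holds P^k
    weight = math.exp(-t)        # holds e^{-t} t^k / k!
    for k in range(terms):
        for i in range(size):
            for j in range(size):
                H[i][j] += weight * power[i][j]
        power = mat_mul(power, P)
        weight *= t / (k + 1)
    return H

def push(nu, K):
    """Row vector nu times kernel K: the law after one application of K."""
    return [sum(nu[i] * K[i][j] for i in range(len(nu)))
            for j in range(len(K[0]))]

def kl(nu, pi):
    """Kullback-Leibler divergence D(nu || pi) in nats."""
    return sum(a * math.log(a / b) for a, b in zip(nu, pi) if a > 0)

# Hypothetical two-state kernel and its stationary distribution.
P = [[0.9, 0.1],
     [0.2, 0.8]]
pi = [2.0 / 3.0, 1.0 / 3.0]
nu = [0.1, 0.9]                  # arbitrary initial distribution

base = kl(nu, pi)
ratio_discrete = kl(push(nu, mat_pow(P, 1)), pi) / base        # n = 1
ratio_continuous = kl(push(nu, heat_kernel(P, 1.0)), pi) / base  # t = 1

print("discrete-time entropy ratio  :", ratio_discrete)
print("continuous-time entropy ratio:", ratio_continuous)
```

Each printed ratio is the entropy contraction for this single `nu` only. The SDPI coefficient is the supremum of the discrete-time ratio over all initial distributions, so the value here is merely a lower bound on it, and the MLSI constant concerns the decay rate as $t \to 0+$ rather than the ratio at $t=1$.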