Fast sampling of satisfying assignments from random $k$-SAT with applications to connectivity

We give a nearly linear-time algorithm to approximately sample satisfying assignments in the random $k$-SAT model when the density of the formula scales exponentially with $k$. The best previously known sampling algorithm for the random $k$-SAT model applies when the density $\alpha=m/n$ of the formula is less than $2^{k/300}$ and runs in time $n^{\exp(\Theta(k))}$. Here $n$ is the number of variables and $m$ is the number of clauses. Our algorithm achieves a significantly faster running time of $n^{1 + o_k(1)}$ and samples satisfying assignments up to density $\alpha\leq 2^{0.039 k}$. The main challenge in our setting is the presence of many variables with unbounded degree, which causes significant correlations within the formula and impedes the application of relevant Markov chain methods from the bounded-degree setting. Our main technical contribution is an $o_k(\log n)$ bound on the sum of influences in the $k$-SAT model, which turns out to be robust against the presence of high-degree variables. This allows us to apply the spectral independence framework and obtain fast mixing results for a uniform-block Glauber dynamics on a carefully selected subset of the variables. The final key ingredient in our method is to take advantage of the sparsity of logarithmic-sized connected sets and the expansion properties of the random formula, and to establish relevant connectivity properties of the set of satisfying assignments that enable the fast simulation of this Glauber dynamics. Our results also allow us to conclude that, with high probability, a random $k$-CNF formula with density at most $2^{0.227 k}$ has a giant component of solutions that are connected in a graph where solutions are adjacent if they have Hamming distance $O_k(\log n)$. We are also able to deduce looseness results for random $k$-CNFs in the same regime.
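To make the objects in the abstract concrete, the following is a minimal Python sketch of a random $k$-CNF formula with $n$ variables and $m = \alpha n$ clauses, together with a naive single-variable Glauber update on satisfying assignments. It is only a toy illustration and not the algorithm of the paper, which runs a uniform-block Glauber dynamics on a carefully selected subset of the variables. The function names, the choice to draw each literal independently and uniformly from the $2n$ literals, and the brute-force search for a starting assignment are all assumptions made for this illustration.

```python
import random
from itertools import product


def random_k_cnf(n, m, k, rng):
    """Return m clauses over variables 0..n-1 (density alpha = m / n).

    Assumption for this sketch: each of the k literals in a clause is drawn
    independently and uniformly from the 2n literals. A literal is a pair
    (variable index, sign), where sign True means the positive literal.
    """
    return [[(rng.randrange(n), rng.random() < 0.5) for _ in range(k)]
            for _ in range(m)]


def satisfies(formula, assignment):
    """True if every clause contains at least one literal made true."""
    return all(any(assignment[v] == sign for v, sign in clause)
               for clause in formula)


def first_solution(formula, n):
    """Brute-force search for a satisfying assignment (tiny n only)."""
    for bits in product((False, True), repeat=n):
        if satisfies(formula, bits):
            return list(bits)
    return None


def glauber_step(formula, assignment, rng):
    """Naive single-variable Glauber update, in place.

    Pick a uniformly random variable and resample its value uniformly among
    the values that keep the formula satisfied. The input assignment must be
    satisfying, so the current value is always allowed.
    """
    v = rng.randrange(len(assignment))
    allowed = []
    for value in (False, True):
        assignment[v] = value
        if satisfies(formula, assignment):
            allowed.append(value)
    assignment[v] = rng.choice(allowed)


if __name__ == "__main__":
    rng = random.Random(0)
    n, k, alpha = 10, 3, 2            # hypothetical toy parameters
    formula = random_k_cnf(n, alpha * n, k, rng)
    state = first_solution(formula, n)
    if state is not None:             # the toy formula may be unsatisfiable
        for _ in range(100):
            glauber_step(formula, state, rng)
        print(state, satisfies(formula, state))
```

Each update preserves satisfiability, but this single-variable chain does not in general connect the whole solution space, which is one reason the approach described in the abstract works with blocks of variables and connectivity at Hamming distance $O_k(\log n)$.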