We consider the problem of estimating the parameters of a Markov Random Field with hard constraints using a single sample. As our main running examples, we use the $k$-SAT and the proper coloring models, as well as general $H$-coloring models; for all of these we obtain both positive and negative results. In contrast to the soft-constrained case, we show in particular that single-sample estimation is not always possible, and that the existence of an estimator is related to the existence of unsatisfiable instances. Our algorithms are based on the pseudo-likelihood estimator. We show variance bounds for this estimator using coupling techniques inspired, in the case of $k$-SAT, by Moitra's sampling algorithm (JACM, 2019); our positive results for colorings build on this new coupling approach. For $q$-colorings on graphs with maximum degree $d$, we give a linear-time estimator when $q>d+1$, whereas the problem is non-identifiable when $q\leq d+1$. For general $H$-colorings, we show that standard conditions that guarantee sampling, such as Dobrushin's condition, are insufficient for one-sample learning; on the positive side, we provide a general condition that is sufficient to guarantee linear-time learning and obtain applications for proper colorings and permissive models. For the $k$-SAT model on formulas with maximum degree $d$, we provide a linear-time estimator when $k\gtrsim 6.45\log d$, whereas the problem becomes non-identifiable when $k\lesssim \log d$.
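To make the pseudo-likelihood idea concrete, here is a minimal sketch for a toy weighted $k$-SAT model. The abstract does not spell out the parametrization, so the following is an assumption made purely for illustration: the single sample $\sigma$ is a satisfying assignment drawn with probability proportional to $e^{\beta\cdot\#\mathrm{true}(\sigma)}$. Under that assumption, a variable whose flip keeps the formula satisfied is true with conditional probability $e^{\beta}/(1+e^{\beta})$, while every other variable is forced and contributes nothing, so the pseudo-likelihood maximizer has a closed form. The function names and the signed-integer clause encoding are hypothetical, and the estimator analyzed in the paper may differ in its details.

```python
import math


def clause_satisfied(clause, assignment):
    """A clause is a list of signed ints: +v means v is true, -v means v is false."""
    return any(assignment[abs(lit)] == (lit > 0) for lit in clause)


def pseudo_likelihood_estimate(clauses, sample):
    """Maximize the pseudo-likelihood of beta from one satisfying assignment.

    ASSUMED model (not taken from the source): P(sigma) is proportional to
    exp(beta * number_of_true_variables) over satisfying assignments.
    Returns None when the maximizer is not finite.
    """
    assignment = dict(sample)  # work on a copy; the caller's sample is untouched

    # Occurrence lists: for each variable, the clauses that mention it.
    occurrences = {v: [] for v in assignment}
    for clause in clauses:
        for lit in clause:
            occurrences[abs(lit)].append(clause)

    free = 0       # variables whose flip still satisfies the formula
    free_true = 0  # how many of those are true in the sample
    for v in assignment:
        assignment[v] = not assignment[v]  # tentatively flip v
        still_ok = all(clause_satisfied(c, assignment) for c in occurrences[v])
        assignment[v] = not assignment[v]  # undo the flip
        if still_ok:
            free += 1
            free_true += int(assignment[v])

    # Each free variable contributes sigma_v * beta - log(1 + e^beta) to the
    # log-pseudo-likelihood, so the maximizer solves
    # e^beta / (1 + e^beta) = free_true / free.
    if free_true == 0 or free_true == free:
        return None  # maximizer is at +/- infinity, or no variable is informative
    return math.log(free_true / (free - free_true))
```

Only clauses containing the flipped variable are rechecked, so the running time is linear in the total size of the formula. The sketch also hints at why identifiability can fail with hard constraints: if the constraints force almost every variable, the count of free variables is too small to carry information about $\beta$.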