The Boolean Satisfiability (SAT) problem stands out as an attractive NP-complete problem in theoretic computer science and plays a central role in a broad spectrum of computing-related applications. Exploiting and tuning SAT solvers under numerous scenarios require massive high-quality industry-level SAT instances, which unfortunately are quite limited in the real world. To address the data insufficiency issue, in this paper, we propose W2SAT, a framework to generate SAT formulas by learning intrinsic structures and properties from given real-world/industrial instances in an implicit fashion. To this end, we introduce a novel SAT representation called Weighted Literal Incidence Graph (WLIG), which exhibits strong representation ability and generalizability against existing counterparts, and can be efficiently generated via a specialized learning-based graph generative model. Decoding from WLIGs into SAT problems is then modeled as finding overlapping cliques with a novel hill-climbing optimization method termed Optimal Weight Coverage (OWC). Experiments demonstrate the superiority of our WLIG-induced approach in terms of graph metrics, efficiency, and scalability in comparison to previous methods. Additionally, we discuss the limitations of graph-based SAT generation for real-world applications, especially when utilizing generated instances for SAT solver parameter-tuning, and pose some potential directions.
翻译:布尔可满足性问题(SAT)在理论计算机科学中是一个极具代表性的NP完全问题,并在广泛的计算相关应用中扮演核心角色。在众多场景下开发和调优SAT求解器需要大量高质量的工业级SAT实例,然而现实世界中此类实例十分稀缺。为解决数据不足问题,本文提出W2SAT框架,通过隐式学习给定真实世界/工业实例的内在结构与属性来生成SAT公式。为此,我们引入一种新型SAT表示方法——加权文字出现图(WLIG),该表示方法相较于现有方法具有更强的表示能力和泛化性,并可通过专门的基于学习的图生成模型高效生成。从WLIG解码为SAT问题的过程被建模为寻找重叠团,并采用一种名为最优加权覆盖(OWC)的新型爬山优化方法。实验表明,与先前方法相比,基于WLIG的方法在图指标、效率和可扩展性方面均具有优越性。此外,我们讨论了基于图的SAT生成在现实应用中的局限性,特别是利用生成实例进行SAT求解器参数调优时的不足,并提出了潜在的研究方向。