In the past few years, there has been an explosive surge in the use of machine learning (ML) techniques to address combinatorial optimization (CO) problems, especially mixed-integer linear programs (MILPs). Despite these achievements, the limited availability of real-world instances often leads to sub-optimal decisions and biased solver assessments, which has motivated a suite of synthetic MILP instance generation techniques. However, existing methods either rely heavily on expert-designed formulations or struggle to capture the rich features of real-world instances. To tackle this problem, we propose G2MILP, the first deep generative framework for MILP instances. Specifically, G2MILP represents MILP instances as bipartite graphs and applies a masked variational autoencoder to iteratively corrupt and replace parts of the original graphs to generate new ones. The appealing feature of G2MILP is that it learns to generate novel and realistic MILP instances without prior expert-designed formulations while simultaneously preserving the structures and computational hardness of real-world datasets. Thus, the generated instances can facilitate downstream tasks for enhancing MILP solvers under limited data availability. We design a suite of benchmarks to evaluate the quality of the generated MILP instances. Experiments demonstrate that our method can produce instances that closely resemble real-world datasets in terms of both structures and computational hardness. The deliverables are released at https://miralab-ustc.github.io/L2O-G2MILP.
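To make the bipartite-graph representation and the corrupt-and-replace idea concrete, the following is a minimal sketch and not the released G2MILP code. It assumes a MILP of the form min c^T x subject to A x <= b, with one vertex per constraint, one vertex per variable, and one edge per nonzero coefficient. The names `BipartiteMILP`, `milp_to_bipartite`, and `mask_and_replace` are hypothetical, and the learned masked variational autoencoder is replaced here by a plain random resampling step purely for illustration.

```python
from dataclasses import dataclass
import random


@dataclass
class BipartiteMILP:
    """Bipartite-graph view of  min c^T x  s.t.  A x <= b  (assumed form)."""
    cons_feats: list  # one entry per constraint vertex: right-hand side b_i
    var_feats: list   # one entry per variable vertex: (c_j, is_integer_j)
    edges: dict       # (i, j) -> A[i][j] for every nonzero coefficient


def milp_to_bipartite(A, b, c, is_integer):
    """Build the graph: constraint i and variable j are linked iff A[i][j] != 0."""
    edges = {
        (i, j): coef
        for i, row in enumerate(A)
        for j, coef in enumerate(row)
        if coef != 0
    }
    return BipartiteMILP(list(b), list(zip(c, is_integer)), edges)


def mask_and_replace(graph, rng):
    """One corrupt-and-replace step on a randomly chosen constraint vertex.

    ASSUMPTION: a trained decoder would predict the new connections and
    coefficients; this placeholder resamples them at random, keeping the
    vertex degree and reusing the old coefficient values.
    """
    n_vars = len(graph.var_feats)
    i = rng.randrange(len(graph.cons_feats))  # constraint vertex to mask

    old_row = {e: w for e, w in graph.edges.items() if e[0] == i}
    degree = min(max(1, len(old_row)), n_vars)
    weights = list(old_row.values()) or [1.0]

    # Drop the masked vertex's edges, then attach freshly sampled ones.
    edges = {e: w for e, w in graph.edges.items() if e[0] != i}
    for j in rng.sample(range(n_vars), degree):
        edges[(i, j)] = rng.choice(weights)

    new_graph = BipartiteMILP(list(graph.cons_feats), list(graph.var_feats), edges)
    return new_graph, i


# Toy instance: 2 constraints, 3 variables.
A = [[1, 2, 0], [0, 1, 3]]
graph = milp_to_bipartite(A, b=[4, 5], c=[1, 1, 1], is_integer=[True, False, True])

# Iterating the step changes more of the original instance each time.
rng = random.Random(0)
generated = graph
for _ in range(2):
    generated, masked = mask_and_replace(generated, rng)
```

The number of iterations controls how far the generated instance drifts from the original: few steps keep it close to the source instance, while many steps yield more novel instances.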