Quadratic Unconstrained Binary Optimization (QUBO) is a generic technique for modeling a wide range of NP-hard combinatorial optimization problems over binary variables. QUBO problems are commonly expressed through a Hamiltonian function, which serves as the objective function to be optimized. Recently, PI-GNN, a generic and scalable framework, was proposed to address Combinatorial Optimization (CO) problems over graphs using a simple Graph Neural Network (GNN) architecture. Its novel contribution was a generic QUBO-formulated, Hamiltonian-inspired loss function optimized with a GNN. In this study, we address a crucial issue with this setup that is especially evident in denser graphs. Reinforcement learning (RL) has also been widely used to address numerous CO problems. We therefore also formulate and empirically evaluate the QUBO-formulated Hamiltonian as a generic reward function in the RL paradigm, so that the actual node projection status is integrated directly into training in the form of rewards. In our experiments, we observed up to 44% improvement in the RL-based setup compared to the PI-GNN algorithm. Our implementation is available at https://github.com/rizveeredwan/learning-graph-structure.
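To make the notion of a QUBO-formulated Hamiltonian concrete, the following is a minimal sketch that is not taken from the linked repository. It assumes the Maximum Independent Set problem as the example CO task and uses a commonly cited formulation, H(x) = -sum_i x_i + P * sum_{(i,j) in E} x_i x_j, where the penalty weight P = 2 and the helper names `mis_qubo`, `hamiltonian`, and `brute_force_minimum` are illustrative assumptions. The scalar H(x) is the kind of quantity that could serve as a loss, or, with its sign flipped, as a reward.

```python
from itertools import product


def mis_qubo(num_nodes, edges, penalty=2.0):
    """Build an upper-triangular QUBO matrix Q for Maximum Independent Set.

    Assumed formulation (illustrative, not the paper's code):
        H(x) = -sum_i x_i + penalty * sum_{(i, j) in E} x_i * x_j
    """
    Q = [[0.0] * num_nodes for _ in range(num_nodes)]
    for i in range(num_nodes):
        Q[i][i] = -1.0  # reward for selecting node i (x_i * x_i == x_i for binary x)
    for i, j in edges:
        a, b = min(i, j), max(i, j)
        Q[a][b] += penalty  # penalty when both endpoints of an edge are selected
    return Q


def hamiltonian(Q, x):
    """Evaluate H(x) = x^T Q x for a binary assignment x."""
    n = len(x)
    return sum(Q[i][j] * x[i] * x[j] for i in range(n) for j in range(n))


def brute_force_minimum(Q):
    """Exhaustively search all 2^n assignments; only feasible for tiny graphs."""
    n = len(Q)
    return min(product((0, 1), repeat=n), key=lambda x: hamiltonian(Q, x))


# Path graph 0 - 1 - 2 - 3 (hypothetical toy instance).
edges = [(0, 1), (1, 2), (2, 3)]
Q = mis_qubo(4, edges)
best = brute_force_minimum(Q)
best_energy = hamiltonian(Q, best)
```

On this toy graph the minimizing assignment selects two non-adjacent nodes. A learned method replaces the exhaustive search with a GNN or an RL policy that proposes the assignment, while the Hamiltonian evaluation stays the same.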