Microservice-based applications exhibit stochastic latencies that arise from long-tail execution patterns and heterogeneous resource constraints across computational nodes. To address this challenge, we first formulate the scheduling problem as a Quadratic Unconstrained Binary Optimization (QUBO) model, which aligns it with emerging quantum-optimization paradigms. Building on this formulation, we propose Q-GARS (Quantum-Guided Adaptive Robust Scheduling), a hybrid framework that combines the QUBO model with Simulated Quantum Annealing (SQA) based combinatorial search and online rescheduling, enabling global microservice rank generation and real-time robust adjustment. We treat the SQA-produced rank as a soft prior and update a closed-loop trust weight that adaptively switches and mixes between this prior and a robust proportional-fairness allocator, maintaining robustness under prediction failures and runtime disturbances. Simulation results show that Q-GARS improves average weighted completion time by 2.1\% relative to a greedy shortest-remaining-processing-time (SRPT) baseline, with gains of up to 16.8\% under heavy-tailed latency. The adaptive mechanism reduces tail latency under high-variance conditions. In addition, Q-GARS achieves a mean node resource utilization of 0.817, which is 1.1 percentage points above the robust baseline (0.806).
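The trust-weighted mixing between the SQA rank prior and the proportional-fairness allocator can be illustrated with a minimal sketch. The abstract does not give the actual rules, so everything below is an assumption made for illustration: the rank-to-share mapping `rank_to_shares`, the convex blend in `mix_shares`, the step-based update in `update_trust`, and the parameters `step` and `tolerance` are all hypothetical. The only standard result used is that, for a single shared resource, weighted proportional fairness allocates capacity in proportion to the task weights.

```python
import numpy as np


def rank_to_shares(rank, capacity=1.0):
    """Map a priority rank (0 = most urgent) to resource shares.

    Hypothetical rule: task i gets a share proportional to 1 / (1 + rank_i).
    """
    weights = 1.0 / (1.0 + np.asarray(rank, dtype=float))
    return capacity * weights / weights.sum()


def fair_shares(task_weights, capacity=1.0):
    """Weighted proportional fairness on one shared resource.

    Maximizing sum_i w_i * log(x_i) subject to sum_i x_i <= capacity
    gives x_i = capacity * w_i / sum_j w_j.
    """
    w = np.asarray(task_weights, dtype=float)
    return capacity * w / w.sum()


def mix_shares(prior, fair, trust):
    """Convex blend: trust = 1 follows the prior, trust = 0 the fair allocator."""
    return trust * np.asarray(prior) + (1.0 - trust) * np.asarray(fair)


def update_trust(trust, predicted, observed, step=0.2, tolerance=0.5):
    """Hypothetical closed-loop update of the trust weight.

    Raise trust when the mean relative prediction error is within
    `tolerance`, lower it otherwise, and keep the result in [0, 1].
    """
    predicted = np.asarray(predicted, dtype=float)
    observed = np.asarray(observed, dtype=float)
    rel_err = float(np.mean(np.abs(observed - predicted) / predicted))
    delta = step if rel_err <= tolerance else -step
    return float(np.clip(trust + delta, 0.0, 1.0))


if __name__ == "__main__":
    prior = rank_to_shares([0, 1, 2])   # rank from the (simulated) annealer
    fair = fair_shares([1.0, 1.0, 1.0])  # equal-weight robust fallback
    trust = 0.5
    print("mixed shares:", mix_shares(prior, fair, trust))
    # Predictions were badly off, so trust in the prior drops.
    trust = update_trust(trust, predicted=[1.0, 1.0, 1.0], observed=[5.0, 4.0, 6.0])
    print("trust after bad predictions:", trust)
```

Because both allocations sum to the same capacity, any blend with a trust weight in [0, 1] also sums to that capacity, so the mixed allocation never oversubscribes the node in this simplified single-resource setting.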