Hybrid quantum-classical applications pose significant resource management challenges due to heterogeneity and dynamism in both infrastructure and workloads. Quantum-HPC environments integrate quantum processing units (QPUs) with diverse classical resources (CPUs, GPUs), while applications span coupling patterns from tightly coupled execution to loosely coupled task parallelism, each with different resource requirements. Traditional HPC schedulers lack visibility into application semantics and cannot respond to fluctuating resource availability at runtime. This paper presents a middleware-based approach for adaptive resource, workload, and task management in hybrid quantum-HPC systems. We make four contributions: (i) a conceptual four-layer middleware architecture that decomposes management across workflow, workload, task, and resource levels, enabling application-aware scheduling over heterogeneous quantum-HPC resources; (ii) a set of execution motifs capturing interaction and coupling characteristics of hybrid applications, realized as quantum mini-apps for systematic workload characterization; (iii) Pilot-Quantum, a middleware framework built on the pilot abstraction that enables late binding and dynamic resource allocation, adapting to resource and workload dynamics at runtime; and (iv) Q-Dreamer, a performance modeling toolkit providing reusable components for informed workload partitioning, including a circuit-cutting optimizer that analytically derives optimal partitioning strategies. Evaluation on heterogeneous HPC platforms (Perlmutter, NVIDIA DGX with H100/B200 GPUs) demonstrates efficient multi-backend orchestration across CPUs, GPUs, and QPUs for diverse execution motifs. Q-Dreamer predicts optimal circuit-cutting configurations with up to 82% accuracy.
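To make the late-binding idea behind the pilot abstraction concrete, the following is a minimal illustrative sketch and not the Pilot-Quantum API. The names `Pilot`, `Task`, and `late_bind` are hypothetical and introduced here only for illustration. The assumption is that a pilot is a placeholder allocation with a fixed number of slots of one resource kind, and that a task is bound to a pilot only at the moment a matching slot is free, rather than when the task is submitted.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Pilot:
    """Hypothetical placeholder allocation with slots of one resource kind."""
    name: str
    kind: str   # "cpu", "gpu", or "qpu"
    slots: int
    running: list = field(default_factory=list)

    def has_free_slot(self) -> bool:
        return len(self.running) < self.slots


@dataclass
class Task:
    """Hypothetical unit of work that needs one slot of a given kind."""
    name: str
    kind: str


def late_bind(tasks, pilots):
    """Bind each queued task to the first pilot with a free matching slot.

    Returns the (task name, pilot name) placements made in this pass
    and the list of tasks that must keep waiting.
    """
    queue = deque(tasks)
    placements, waiting = [], []
    while queue:
        task = queue.popleft()
        pilot = next(
            (p for p in pilots if p.kind == task.kind and p.has_free_slot()),
            None,
        )
        if pilot is None:
            waiting.append(task)          # no slot yet: decision is deferred
        else:
            pilot.running.append(task)    # binding happens only now
            placements.append((task.name, pilot.name))
    return placements, waiting


# Initial resources: one GPU slot and one QPU slot.
pilots = [Pilot("pilot-gpu", "gpu", slots=1), Pilot("pilot-qpu", "qpu", slots=1)]
tasks = [
    Task("simulate-fragment-0", "gpu"),
    Task("simulate-fragment-1", "gpu"),
    Task("run-circuit", "qpu"),
]
placements, waiting = late_bind(tasks, pilots)

# A new GPU pilot becomes available at runtime; waiting tasks are rebound.
pilots.append(Pilot("pilot-gpu-2", "gpu", slots=1))
placements_2, waiting_2 = late_bind(waiting, pilots)
```

In this sketch the second GPU task is not tied to any resource until a new pilot appears, which is the property that lets a middleware layer adapt to changing resource availability. A real system would additionally handle task completion, slot release, and queue-wait times on the underlying batch scheduler.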