Circuit cutting decomposes a large quantum circuit into smaller subcircuits executed independently; expectation values are recovered by classically combining subcircuit outcomes. Prior work characterises cutting overhead via subcircuit counts and sampling complexity, but its end-to-end impact on iterative, estimator-driven training pipelines remains under-measured from a systems perspective. We propose DistributedEstimator, a cut-aware estimator execution pipeline that treats circuit cutting as a staged distributed workload, instrumenting each query across four phases: partitioning, subexperiment generation, parallel execution, and classical reconstruction. Using logged runtime traces and learning outcomes on two binary classification workloads (Iris and MNIST), we quantify cutting overheads, scaling limits, and sensitivity to injected stragglers, and assess whether accuracy and robustness are preserved under matched training budgets. Reconstruction dominates per-query time (a median of 53% and a 95th percentile of 58% at three cuts), bounding the speed-up achievable under parallelism. Despite this, test accuracy is fully preserved on Iris and maintained without systematic degradation on MNIST across all cut configurations. Robustness under Gaussian noise and FGSM perturbations is similarly preserved, with several configurations matching or improving on the uncut baseline. Exponential growth of subexperiment counts ($O(9^c)$ in the number of cuts $c$ for CNOT-based decomposition) is a fundamental barrier that limits practical experimentation to small qubit counts. These results establish that practical scaling for learning workloads requires reducing and overlapping reconstruction, scheduling policies for barrier-dominated critical paths, and computationally efficient reconstruction strategies for larger qubit counts.
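The two quantitative claims above can be made concrete with a little arithmetic. The sketch below is illustrative only and is not taken from DistributedEstimator: the function names are hypothetical, the subexperiment count is taken as exactly $9^c$ (the abstract gives only the $O(9^c)$ growth rate), and the speed-up bound assumes Amdahl's law applies, that is, that reconstruction is fully serial and every other phase parallelises perfectly.

```python
def subexperiment_count(cuts: int, terms_per_cut: int = 9) -> int:
    """Illustrative count of subexperiments, assuming exactly terms_per_cut**cuts.

    The source states only O(9^c) growth; the constant factor of 1 is an assumption.
    """
    return terms_per_cut ** cuts


def amdahl_speedup_bound(serial_fraction: float, workers: int) -> float:
    """Amdahl's-law speed-up with a fixed serial fraction and `workers` parallel workers.

    Treating reconstruction as the serial fraction is an assumption of this sketch.
    """
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / workers)


if __name__ == "__main__":
    # Growth of the subexperiment count with the number of cuts.
    for c in range(1, 5):
        print(f"cuts={c}: {subexperiment_count(c)} subexperiments")

    # Speed-up ceiling if the reported median reconstruction share (53%) were serial.
    for n in (2, 8, 64):
        print(f"workers={n}: speed-up <= {amdahl_speedup_bound(0.53, n):.2f}")
```

Under these assumptions the speed-up can never exceed 1/0.53, a little under 2x, however many workers are added, which is one way to read the statement that reconstruction bounds the benefit of parallel execution.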