Characterizing application-layer user throughput in next-generation networks is increasingly challenging, because the higher capacity of the 5G Radio Access Network (RAN) shifts connectivity bottlenecks towards deeper parts of the network. Traditional methods, such as drive tests and operator equipment counters, are costly, limited in scope, or unable to capture end-to-end (E2E) Quality of Service (QoS) and its variability. In this work, we leverage large-scale crowdsourced measurements (including E2E, radio, contextual, and network deployment features collected by the user equipment (UE)) to propose an uncertainty-aware and explainable approach for downlink user throughput estimation. We first validate prior 4G methods, improving R^2 by 8.7%, and then extend them to 5G NSA and 5G SA, providing the first benchmarks for 5G crowdsourced datasets. To address the variability of throughput, we apply NGBoost, a model that outputs both point estimates and calibrated confidence intervals; to our knowledge, this is its first use in the field of computer communications. Finally, we use the proposed model to analyze the evolution from 4G to 5G SA and show that throughput bottlenecks move from the RAN to the transport and service layers, as indicated by E2E metrics gaining importance over radio-related features.
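As an illustration of what "point estimates and calibrated confidence intervals" can mean in practice, the following minimal sketch turns a per-sample predictive distribution into an interval and checks how often observations fall inside it. It assumes a Gaussian predictive distribution, and the function names and the throughput values are hypothetical, chosen only for illustration; none of them come from the dataset or the model described above.

```python
from statistics import NormalDist


def prediction_interval(mean, std, level=0.9):
    """Central interval of an (assumed) Gaussian predictive distribution."""
    tail = (1.0 - level) / 2.0
    dist = NormalDist(mu=mean, sigma=std)
    return dist.inv_cdf(tail), dist.inv_cdf(1.0 - tail)


def empirical_coverage(predictions, observed, level=0.9):
    """Fraction of observations that fall inside their predicted interval."""
    hits = 0
    for (mean, std), y in zip(predictions, observed):
        low, high = prediction_interval(mean, std, level)
        hits += low <= y <= high
    return hits / len(observed)


# Hypothetical per-sample predictions (mean, std) in Mbit/s; not real data.
predictions = [(120.0, 30.0), (45.0, 10.0), (300.0, 80.0), (15.0, 5.0)]
# Hypothetical measured downlink throughput for the same samples, in Mbit/s.
observed = [135.0, 70.0, 260.0, 12.0]

coverage = empirical_coverage(predictions, observed, level=0.9)
print(f"Empirical coverage of the 90% intervals: {coverage:.2f}")
```

In a well-calibrated model, the empirical coverage measured on a large held-out set should be close to the nominal level; four samples, as used here, are far too few to draw any such conclusion.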