Ultra-reliable low-latency communication (URLLC) for vehicular networks requires sufficient physical-layer (PHY) compute headroom at the network edge, where roadside units (RSUs) and compact next-generation base stations (gNBs) must meet strict timing constraints while co-hosting higher-layer services. In 5G New Radio (5G NR), low-density parity-check (LDPC) decoding is a latency-sensitive iterative PHY workload whose cost scales with both workload parallelism and the decoder iteration budget, making it a potential bottleneck on general-purpose central processing units (CPUs). This paper presents a reproducible, telemetry-backed microbenchmark derived from the Sionna LDPC5G baseline to characterize the compute headroom obtained through graphics processing unit (GPU) offload on compact heterogeneous edge platforms. We evaluate decoder behavior across multiple processor architectures and a wide range of batch sizes and iteration counts, with emphasis on dense operating regimes relevant to edge provisioning. Results show that GPU acceleration substantially increases LDPC throughput, reduces amortized decode service time, and shifts compute pressure away from the CPU, thereby improving the feasibility of meeting edge-RSU timing budgets under heavy parallel workloads. These findings indicate that GPU offload can provide substantial spare PHY compute margin for compact vehicular edge platforms, making dense decode workloads practical within realistic edge power and timing constraints.
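The measurement methodology the abstract describes, sweeping batch size and iteration count and reporting amortized per-codeword decode service time, can be sketched as follows. This is a minimal illustrative harness, not the paper's actual benchmark: `decode_batch` is a hypothetical iteration-proportional stand-in for an LDPC decoder (it is not the Sionna `LDPC5GDecoder`), and `amortized_us_per_codeword` is an assumed helper name.

```python
import time
import numpy as np

def decode_batch(llrs: np.ndarray, num_iter: int) -> np.ndarray:
    """Toy stand-in whose cost grows with batch size and iteration count.

    Performs a min-sum-flavored row update per iteration as a workload
    proxy for belief propagation, then returns hard decisions.
    """
    msgs = llrs.copy()
    for _ in range(num_iter):
        # Product of signs times minimum magnitude per codeword row
        # (treat exact zeros as +1 so the sign product is well defined).
        sign = np.prod(np.sign(msgs) + (msgs == 0), axis=1, keepdims=True)
        msgs = msgs + 0.1 * sign * np.min(np.abs(msgs), axis=1, keepdims=True)
    return (msgs < 0).astype(np.int8)  # hard decisions in {0, 1}

def amortized_us_per_codeword(batch: int, n: int, num_iter: int,
                              reps: int = 3) -> float:
    """Best-of-reps wall-clock decode time per codeword, in microseconds."""
    llrs = np.random.default_rng(0).normal(size=(batch, n))
    decode_batch(llrs, num_iter)  # warm-up run, excluded from timing
    best = float("inf")
    for _ in range(reps):
        t0 = time.perf_counter()
        decode_batch(llrs, num_iter)
        best = min(best, time.perf_counter() - t0)
    return 1e6 * best / batch

if __name__ == "__main__":
    # Sweep the two axes the paper varies: batch size and iteration budget.
    for batch in (16, 256):
        for num_iter in (8, 20):
            us = amortized_us_per_codeword(batch, n=1024, num_iter=num_iter)
            print(f"batch={batch:4d} iters={num_iter:2d} -> {us:8.2f} us/codeword")
```

Larger batches amortize fixed per-call overhead, which is the effect that makes dense operating regimes attractive once a GPU absorbs the parallel work.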