Scheduling GPU-using real-time tasks with analyzable guarantees is challenging because of the intricate interaction between CPU and GPU resources and the complexity of the GPU hardware and software stack. Although the real-time community has studied this problem extensively, several limitations persist, including absent or limited GPU-level preemption, long blocking times, and/or the need for extensive modifications to program code. In this paper, we propose GCAPS, a GPU Context-Aware Preemptive Scheduling approach for real-time GPU tasks. Our approach controls GPU context scheduling at the device driver level and enables priority-based preemption of GPU execution by simply adding one-line macros at GPU segment boundaries. In addition, we provide a comprehensive response time analysis of GPU-using tasks for both our proposed approach and the default Nvidia GPU driver scheduling, which follows a work-conserving round-robin policy. Through empirical evaluations and case studies, we demonstrate the effectiveness of the proposed approach in improving taskset schedulability and response time. The results show significant improvements over prior work and over the default scheduling approach, with up to 40% higher schedulability, while also achieving predictable worst-case behavior on Nvidia Jetson embedded platforms.
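To make the contrast between the two policies named above concrete, the following is a toy sketch and not the paper's analysis or implementation. It assumes two GPU contexts released at the same instant, a hypothetical fixed time slice, and zero context-switch overhead; the function names, job demands, and priorities are all illustrative assumptions. It compares completion times under a work-conserving round-robin policy with those under priority-based preemption.

```python
from collections import deque

def round_robin(jobs, time_slice):
    """Work-conserving round-robin over GPU contexts (toy model).

    jobs: dict mapping a context name to its GPU demand in time units;
    all contexts are assumed to be released at t = 0.
    Returns a dict mapping each context name to its completion time.
    """
    queue = deque(jobs.items())
    now, finish = 0, {}
    while queue:
        name, remaining = queue.popleft()
        run = min(time_slice, remaining)  # GPU never idles while work is pending
        now += run
        if remaining > run:
            queue.append((name, remaining - run))
        else:
            finish[name] = now
    return finish

def priority_preemptive(jobs, priority):
    """Priority-based preemption (toy model); lower number = higher priority.

    With simultaneous release, this reduces to running the contexts
    to completion in priority order.
    """
    now, finish = 0, {}
    for name in sorted(jobs, key=lambda n: priority[n]):
        now += jobs[name]
        finish[name] = now
    return finish

# Hypothetical workload: a short high-priority job and a long low-priority one.
jobs = {"high": 4, "low": 10}
priority = {"high": 0, "low": 1}

rr = round_robin(jobs, time_slice=2)
pp = priority_preemptive(jobs, priority)
print(rr)  # {'high': 6, 'low': 14}
print(pp)  # {'high': 4, 'low': 14}
```

In this toy workload the high-priority job finishes at time 4 under priority-based preemption but at time 6 under round-robin, because it must share time slices with the low-priority context. The low-priority job finishes at time 14 in both cases.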