As multicore embedded systems advance, leakage power, which grows exponentially with chip temperature, has surpassed dynamic power consumption. Energy-aware solutions use dynamic voltage and frequency scaling (DVFS) to mitigate overheating in performance-intensive scenarios, while software approaches reduce power by allocating high-utilization tasks across core configurations in parallel systems. However, existing heuristics lack per-core frequency monitoring and therefore fail to address overheating caused by uneven core activity, and task assignment without detailed profiling overlooks irregular execution patterns. We target OpenMP DAG workloads. Because makespan, energy, and thermal goals often conflict within a single benchmark, this work prioritizes performance (makespan) and reports energy and thermal results as secondary outcomes. To overcome these issues, we propose HiDVFS, a hierarchical multi-agent, performance-aware DVFS scheduler for parallel systems that optimizes task allocation based on profiling data, core temperatures, and a makespan-first objective. It employs three agents: one selects cores and frequencies using profiler data, another manages core combinations using temperature sensor readings, and a third sets task priorities during resource contention. A makespan-focused reward with energy and temperature regularizers is used to estimate future states and improve sample efficiency. Experiments on the NVIDIA Jetson TX2 using the BOTS suite (9 benchmarks) compare HiDVFS against state-of-the-art approaches. With multi-seed validation (seeds 42, 123, 456), HiDVFS achieves the best fine-tuned performance, with an average makespan of 4.16 ± 0.58 s (L10), a 3.44x speedup over GearDVFS (14.32 ± 2.61 s), and a 50.4% energy reduction (63.7 kJ vs 128.4 kJ). Across all BOTS benchmarks, HiDVFS achieves an average 3.95x speedup and 47.1% energy reduction.
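To make the idea of a makespan-focused reward with energy and temperature regularizers concrete, the following is a minimal sketch under stated assumptions, not the reward used by HiDVFS. The functional form, the weights `energy_weight` and `temp_weight`, the threshold `temp_limit_c`, and all names are hypothetical and chosen only for illustration; the abstract does not specify them.

```python
from dataclasses import dataclass


@dataclass
class StepOutcome:
    """Measurements from one scheduled run (hypothetical container)."""
    makespan_s: float        # makespan of the run, in seconds
    energy_j: float          # energy consumed over the run, in joules
    max_core_temp_c: float   # hottest core temperature observed, in Celsius


def makespan_first_reward(
    outcome: StepOutcome,
    ref_makespan_s: float,
    ref_energy_j: float,
    temp_limit_c: float = 80.0,   # assumed thermal threshold, not from the paper
    energy_weight: float = 0.1,   # assumed small weight: energy is secondary
    temp_weight: float = 0.1,     # assumed small weight: temperature is secondary
) -> float:
    """Reward dominated by makespan, with energy and temperature penalties."""
    if ref_makespan_s <= 0 or ref_energy_j <= 0 or temp_limit_c <= 0:
        raise ValueError("reference values and temperature limit must be positive")

    # Primary term: relative makespan improvement over a reference run
    # (positive when the run finishes faster than the reference).
    makespan_term = (ref_makespan_s - outcome.makespan_s) / ref_makespan_s

    # Regularizer 1: energy relative to the reference run.
    energy_penalty = outcome.energy_j / ref_energy_j

    # Regularizer 2: penalize only the temperature excess above the limit.
    temp_excess = max(0.0, outcome.max_core_temp_c - temp_limit_c)
    temp_penalty = temp_excess / temp_limit_c

    return makespan_term - energy_weight * energy_penalty - temp_weight * temp_penalty
```

As a usage illustration, plugging in the reported averages (4.16 s and 63.7 kJ against a 14.32 s and 128.4 kJ reference) with an assumed peak temperature below the limit, as in `makespan_first_reward(StepOutcome(4.16, 63700.0, 70.0), 14.32, 128400.0)`, yields a higher reward than the reference run itself. Keeping the regularizer weights small relative to the makespan term is one way to encode a makespan-first priority.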