The Energy Blind Spot: NVIDIA's Flagship Edge AI Hardware Cannot Support Process-Level Energy Attribution

Agentic AI workloads - where a single user goal triggers multi-step orchestration, tool calls, retries, and failure recovery - are being targeted for edge deployment, with NVIDIA, Dell, HP, ASUS, MSI, Acer, and Gigabyte all shipping GB10-based desktop AI systems in 2026. We recently demonstrated that orchestration structure dominates agentic energy cost, with workflows consuming 4.33x more energy per successful goal than linear baselines and OOI reaching 7.63x for multi-step reasoning tasks. Separately, Rajat et al. show that CPU-side processing accounts for up to 90.6% of total latency and 44% of total dynamic energy in agentic workloads. We report a systematic energy-observability audit of the ASUS Ascent GX10 (GB10 SoC) and find that the platform exposes no CPU energy counter, no INA power-rail monitor, no IPMI/BMC, and no SCMI powercap protocol through any supported software interface. The only on-device energy telemetry is instantaneous GPU power via NVML. We further discover that the MediaTek firmware already computes per-rail energy internally via an undocumented ACPI interface (SPBM), but NVIDIA states there are "no plans to expose CPU rail information." On-device per-process energy attribution - as performed on x86 via RAPL - is therefore not reproducible on this platform through supported interfaces. We formalize a hardware requirements specification for energy-attributed AI, propose an interim calibration bridge using external DC metering combined with GPU subtraction, and identify a standards-track path via SCMI powercap. Our findings motivate the low-carbon computing community to demand energy observability as a first-class hardware requirement.
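The interim calibration bridge mentioned in the abstract (external DC metering combined with GPU subtraction) can be illustrated with a minimal sketch. The abstract does not give the authors' procedure, so everything below is an assumption made for illustration: the function names `integrate_energy` and `residual_energy` are hypothetical, both sample streams are assumed to be timestamped on a shared clock and to cover the same time window, and conversion losses between the DC input and the rails are ignored. The GPU samples would in practice come from instantaneous NVML power readings, and the DC samples from an external meter.

```python
def integrate_energy(samples):
    """Trapezoidal integration of (time_s, power_w) samples, returning joules."""
    energy_j = 0.0
    for (t0, p0), (t1, p1) in zip(samples, samples[1:]):
        energy_j += 0.5 * (p0 + p1) * (t1 - t0)
    return energy_j


def residual_energy(dc_samples, gpu_samples):
    """Estimate non-GPU energy as whole-system DC energy minus GPU energy.

    Assumption: both sample lists span the same window on a shared clock.
    The result is a system-level residual (CPU, memory, I/O, losses), not a
    per-process or per-rail figure.
    """
    return integrate_energy(dc_samples) - integrate_energy(gpu_samples)


# Illustrative values only, not measurements from the GX10.
dc = [(0.0, 50.0), (1.0, 50.0), (2.0, 50.0)]   # external DC meter, watts
gpu = [(0.0, 20.0), (1.0, 30.0), (2.0, 20.0)]  # GPU power samples, watts
print(residual_energy(dc, gpu))
```

The residual bundles every non-GPU consumer together, so a subtraction of this kind can bound CPU-side energy but cannot by itself recover the process-level attribution that RAPL enables on x86.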

