The Energy Blind Spot: NVIDIA's Flagship Edge AI Hardware Cannot Support Process-Level Energy Attribution

Agentic AI workloads, in which a single user goal triggers multi-step orchestration, tool calls, retries, and failure recovery, are being targeted for edge deployment: NVIDIA, Dell, HP, ASUS, MSI, Acer, and Gigabyte are all shipping GB10-based desktop AI systems in 2026. We recently demonstrated that orchestration structure dominates agentic energy cost, with workflows consuming 4.33x more energy per successful goal than linear baselines and OOI reaching 7.63x for multi-step reasoning tasks. Separately, Raj et al. show that CPU-side processing accounts for up to 90.6% of total latency and 44% of total dynamic energy in agentic workloads. We report a systematic energy-observability audit of the ASUS Ascent GX10 (GB10 SoC) and find that the platform exposes no CPU energy counter, no INA power-rail monitor, no IPMI/BMC, and no SCMI powercap protocol through any supported software interface. The only on-device energy telemetry is instantaneous GPU power via NVML. We further discover that the MediaTek firmware already computes per-rail energy internally and makes it reachable through an undocumented ACPI interface (SPBM), but NVIDIA states there are "no plans to expose CPU rail information." On-device per-process energy attribution, as performed on x86 via RAPL, is therefore not reproducible on this platform through supported interfaces. We formalize a hardware requirements specification for energy-attributed AI, propose an interim calibration bridge for per-domain energy decomposition (confirmed on the Acer Veriton GN100, where CPU energy accumulators are live), and identify a standards-track path via SCMI powercap. Our findings motivate the low-carbon computing community to demand energy observability as a first-class hardware requirement.
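
Because the only supported on-device telemetry described above is instantaneous GPU power, any energy figure has to be reconstructed by integrating power samples over time. The following is a minimal sketch of that step, not the authors' method: it assumes samples arrive as (timestamp in seconds, power in watts) pairs, for example polled from NVML, whose power query reports milliwatts that would first be divided by 1000. The function names `integrate_energy_joules` and `energy_per_successful_goal` are hypothetical and chosen only for illustration.

```python
from typing import Sequence, Tuple


def integrate_energy_joules(samples: Sequence[Tuple[float, float]]) -> float:
    """Trapezoidal integration of (timestamp_s, power_w) samples into joules.

    Assumption: samples are sorted by time. Fewer than two samples
    carry no duration, so the result is 0.0.
    """
    if len(samples) < 2:
        return 0.0
    energy = 0.0
    for (t0, p0), (t1, p1) in zip(samples, samples[1:]):
        dt = t1 - t0
        if dt < 0:
            raise ValueError("timestamps must be non-decreasing")
        # Area of the trapezoid between two consecutive power readings.
        energy += 0.5 * (p0 + p1) * dt
    return energy


def energy_per_successful_goal(total_energy_j: float, successful_goals: int) -> float:
    """Normalize total energy by the number of goals that actually succeeded."""
    if successful_goals <= 0:
        raise ValueError("at least one successful goal is required")
    return total_energy_j / successful_goals


if __name__ == "__main__":
    # Illustrative, made-up readings: (seconds, watts).
    readings = [(0.0, 10.0), (1.0, 10.0), (2.0, 20.0)]
    total = integrate_energy_joules(readings)
    print(total, energy_per_successful_goal(total, 5))
```

Note that this yields only whole-GPU energy for the sampling window. It says nothing about the CPU rails or about which process consumed the energy, which is exactly the gap the audit identifies.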

