Agentic AI workloads, in which a single user goal triggers multi-step orchestration, tool calls, retries, and failure recovery, are being targeted for edge deployment, with NVIDIA, Dell, HP, ASUS, MSI, Acer, and Gigabyte all shipping GB10-based desktop AI systems in 2026. We recently demonstrated that orchestration structure dominates agentic energy cost, with workflows consuming 4.33x more energy per successful goal than linear baselines and OOI reaching 7.63x for multi-step reasoning tasks. Separately, Raj et al. show that CPU-side processing accounts for up to 90.6% of total latency and 44% of total dynamic energy in agentic workloads. We report a systematic energy-observability audit of the ASUS Ascent GX10 (GB10 SoC) and find that the platform exposes no CPU energy counter, no INA power-rail monitor, no IPMI/BMC, and no SCMI powercap protocol through any supported software interface. The only on-device energy telemetry is instantaneous GPU power via NVML. We further discover that the MediaTek firmware already computes per-rail energy internally via an undocumented ACPI interface (SPBM), but NVIDIA states there are "no plans to expose CPU rail information." On-device per-process energy attribution, as performed on x86 via RAPL, is therefore not reproducible on this platform through supported interfaces. We formalize a hardware requirements specification for energy-attributed AI, propose an interim calibration bridge for per-domain energy decomposition (confirmed on the Acer Veriton GN100, where CPU energy accumulators are live), and identify a standards-track path via SCMI powercap. Our findings give the low-carbon computing community grounds to demand energy observability as a first-class hardware requirement.