Small satellites impose tight size, power, and payload budgets on AI inference, which tend to limit onboard compute capability and data handling. These conditions motivate a clear baseline for quantized AI inference under bounded compute and memory resources. We instantiate this baseline with a representative embedded-vision neural-network workload and present a measurement-based characterization of its quantized execution on highly constrained embedded platforms (for instance, Cortex-M), which we treat as a lower-bound operating point. In this regime, scaling tends to rely on explicit orchestration rather than OS-managed, transparent multicore scheduling, and timing behavior is shaped by instruction efficiency and memory movement. The characterization therefore provides a structured reference for estimating execution time across orchestrated configurations (e.g., multiple cores and/or devices), treating orchestration and architectural variation as explicit design choices. We report latency metrics alongside data-movement observations, and interpret these measurements in light of ALU/SIMD utilization under quantized arithmetic on the Cortex-M. Finally, we outline how this baseline serves as a reference point for positioning the results against more space-typical embedded processor classes (e.g., LEON/NOEL-V).