Quantized AI Inference on Constrained Embedded Platforms for Small-Satellite Settings

In resource-constrained small-satellite settings, AI inference must operate under tight size, power, and payload budgets, which tend to limit onboard compute capability and data handling. These conditions motivate a clear baseline for quantized AI inference under bounded compute and memory resources. A representative embedded-vision neural-network workload serves as the reference case for this baseline. This paper presents a measurement-based characterization of quantized execution of that workload on highly constrained embedded platforms (for instance, Cortex-M), treated as a lower-bound operating point. In this regime, scaling tends to rely on explicit orchestration rather than OS-managed, transparent multicore scheduling, and timing behavior is shaped by instruction efficiency and memory movement. The characterization therefore provides a structured reference for estimating execution time across orchestrated configurations (e.g., multiple cores and/or devices), treating orchestration and architectural variation as explicit design choices. We report latency metrics alongside data-movement observations, and interpret these measurements in light of ALU/SIMD utilization under quantized arithmetic on the Cortex-M. Finally, we outline how this baseline serves as a reference point for positioning the results against more space-typical embedded processor classes (e.g., LEON/NOEL-V).

