vAccSOL: Efficient and Transparent AI Vision Offloading for Mobile Robots

Mobile robots are increasingly deployed for inspection, patrol, and search-and-rescue operations, relying on computer vision for perception, navigation, and autonomous decision-making. However, executing modern vision workloads onboard is challenging due to limited compute resources and strict energy constraints. While some platforms include embedded accelerators, these are typically tied to proprietary software stacks, leaving user-defined workloads to run on resource-constrained companion computers. We present vAccSOL, a framework for efficient and transparent execution of AI-based vision workloads across heterogeneous robotic and edge platforms. vAccSOL integrates two components: SOL, a neural network compiler that generates optimized inference libraries with minimal runtime dependencies, and vAccel, a lightweight execution framework that transparently dispatches inference locally on the robot or to nearby edge infrastructure. This combination enables hardware-optimized inference and flexible execution placement without requiring modifications to robot applications. We evaluate vAccSOL on a real-world testbed with a commercial quadruped robot and twelve deep learning models covering image classification, video classification, and semantic segmentation. Compared to a PyTorch compiler baseline, SOL achieves comparable or better inference performance. With edge offloading, vAccSOL reduces robot-side power consumption by up to 80% and edge-side power by up to 60% compared to PyTorch, while increasing vision pipeline frame rate by up to 24x, extending the operating lifetime of battery-powered robots.
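
As a rough illustration of what "transparent" execution placement could look like from the application's side, the sketch below keeps a single `infer` call in the application and selects the execution target through configuration. It is a toy model under stated assumptions, not vAccel's or SOL's actual API: the names `infer`, `run_local`, `run_edge`, `classify`, and the `INFER_BACKEND` environment variable are all hypothetical, and the "edge" path performs no real network transfer.

```python
import os
from typing import Callable, Dict, List

def classify(frame: List[float]) -> str:
    # Hypothetical stand-in for a compiled model: labels a frame by mean intensity.
    if not frame:
        raise ValueError("frame must not be empty")
    return "bright" if sum(frame) / len(frame) > 0.5 else "dark"

def run_local(frame: List[float]) -> str:
    # Assumption: inference runs on the robot's companion computer.
    return "local:" + classify(frame)

def run_edge(frame: List[float]) -> str:
    # Assumption: inference is forwarded to a nearby edge node.
    # No networking is done here; this only models the placement decision.
    return "edge:" + classify(frame)

# Both backends share one call signature, so the caller is unaffected by placement.
BACKENDS: Dict[str, Callable[[List[float]], str]] = {
    "local": run_local,
    "edge": run_edge,
}

def infer(frame: List[float]) -> str:
    # Placement comes from configuration, not from application code.
    name = os.environ.get("INFER_BACKEND", "local")
    if name not in BACKENDS:
        raise ValueError(f"unknown backend: {name}")
    return BACKENDS[name](frame)
```

In this sketch the robot application always calls `infer(frame)`; switching between onboard and edge execution changes only the hypothetical `INFER_BACKEND` setting, which mirrors the abstract's claim that placement can change without modifying the application.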
