Hardware acceleration can revolutionize robotics, enabling new applications by speeding up robot response times while remaining power-efficient. However, the diversity of acceleration options makes it difficult for roboticists to deploy accelerated systems without expertise in each specific hardware platform. In this work, we address this challenge with RobotCore, an architecture for integrating hardware acceleration into the widely used ROS 2 robotics software framework. This architecture is target-agnostic (it supports edge, workstation, data center, and cloud targets) and accelerator-agnostic (it supports both FPGAs and GPUs). It builds on the common ROS 2 build system and tools, and is portable across different research and commercial solutions through a new firmware layer. We also leverage the Linux Trace Toolkit: next generation (LTTng) for low-overhead real-time tracing and benchmarking. To demonstrate the acceleration enabled by this architecture, we use it to deploy a ROS 2 perception computational graph on a CPU and an FPGA. We employ our integrated tracing and benchmarking to analyze bottlenecks, uncovering insights that guide us to improve FPGA communication efficiency. In particular, we design an intra-FPGA ROS 2 node communication queue to enable faster data flows, and use it in conjunction with FPGA-accelerated nodes to achieve a 24.42% speedup over a CPU.
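To make the reported speedup figure concrete, the following minimal sketch shows one common way to derive a percentage speedup from two sets of traced latencies. It is an illustration under stated assumptions, not the benchmarking code behind the reported result: the function name `percent_speedup`, the choice of the mean as the summary statistic, and the latency values are all hypothetical, and the abstract does not state which speedup convention it uses. Here we assume the relative latency reduction, (t_cpu - t_fpga) / t_cpu.

```python
from statistics import mean


def percent_speedup(baseline_ms, accelerated_ms):
    """Relative latency reduction of the accelerated run versus the baseline, in percent.

    Assumed convention: (mean(baseline) - mean(accelerated)) / mean(baseline) * 100.
    """
    base = mean(baseline_ms)
    accel = mean(accelerated_ms)
    if base <= 0:
        raise ValueError("baseline latency must be positive")
    return (base - accel) / base * 100.0


# Hypothetical per-message latencies in milliseconds.
# These are placeholders for illustration, not measurements from this work.
cpu_latencies_ms = [10.0, 10.0, 10.0, 10.0]
fpga_latencies_ms = [8.0, 8.0, 8.0, 8.0]

result = percent_speedup(cpu_latencies_ms, fpga_latencies_ms)
print(f"{result:.2f}% faster than the CPU baseline")
```

In practice the two latency lists would be extracted from trace data for the same computational graph, once with CPU-only nodes and once with FPGA-accelerated nodes. Other summary statistics, such as the median or maximum latency, can be substituted for the mean.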