3D object detection plays a pivotal role in many applications, most notably autonomous driving and robotics. These applications are commonly deployed on edge devices so that they can interact promptly with the environment, and they often require near real-time response. With limited computation power, it is challenging to execute 3D detection on the edge using highly complex neural networks. Common approaches such as offloading to the cloud incur significant latency overhead because of the large volume of point cloud data that must be transmitted. To resolve the tension between wimpy edge devices and compute-intensive inference workloads, we explore the possibility of empowering fast 2D detection to extrapolate 3D bounding boxes. To this end, we present Moby, a novel system that demonstrates the feasibility and potential of our approach. We design a transformation pipeline for Moby that generates 3D bounding boxes efficiently and accurately from 2D detection results without running 3D detectors. Further, we devise a frame offloading scheduler that judiciously decides when to launch the 3D detector in the cloud to prevent errors from accumulating. Extensive evaluations on NVIDIA Jetson TX2 with real-world autonomous driving datasets demonstrate that Moby offers up to 91.9% latency improvement with modest accuracy loss over the state of the art.