Indoor service robots need perception that is robust, more privacy-preserving than RGB video, and feasible on embedded hardware. We present a camera-free 2D LiDAR object detection pipeline that encodes short-term temporal context by stacking three consecutive scans as RGB channels, yielding a compact YOLOv8n input that avoids occupancy-grid construction while preserving angular structure and motion cues. Evaluated in Webots across 160 randomized indoor scenarios with strict scenario-level holdout, the method achieves 98.4% mAP@0.5 (0.778 mAP@0.5:0.95) with 94.9% precision and 94.7% recall on four object classes. On a Raspberry Pi 5, it runs in real time with a mean post-warm-up end-to-end latency of 47.8 ms per frame, including scan encoding and postprocessing. Relative to a closely related occupancy-grid LiDAR-YOLO pipeline evaluated on the same platform, the proposed representation yields substantially lower reported end-to-end latency. Although the results are simulation-based, they suggest that lightweight temporal encoding can enable accurate, real-time LiDAR-only detection for embedded indoor robotics without capturing RGB appearance.
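The core encoding step can be sketched as follows: three consecutive 1-D range scans fill the R, G, and B channels of one compact image, with the beam index running along the image width so angular structure is preserved and channel-to-channel differences carry short-term motion cues. This is a minimal illustrative sketch, assuming a simple clip-and-normalize mapping and row tiling to image height; the function name, `max_range`, `height`, and the tiling scheme are assumptions, not the paper's exact recipe.

```python
import numpy as np

def encode_scans_rgb(scan_t2, scan_t1, scan_t0, max_range=10.0, height=64):
    """Stack three consecutive LiDAR scans (oldest -> newest) as RGB channels.

    Each scan is a 1-D array of ranges in metres, one value per beam.
    Ranges are clipped to max_range, scaled to 0..255, and each 1-D scan
    is tiled vertically to form one (height, n_beams) channel.
    Returns a (height, n_beams, 3) uint8 image suitable as a YOLO-style input.
    """
    channels = []
    for scan in (scan_t2, scan_t1, scan_t0):
        r = np.clip(np.asarray(scan, dtype=np.float32), 0.0, max_range)
        row = (r / max_range * 255.0).astype(np.uint8)   # width = number of beams
        channels.append(np.tile(row, (height, 1)))       # repeat row to image height
    return np.stack(channels, axis=-1)

# Usage with three synthetic 360-beam scans of a receding object:
img = encode_scans_rgb(np.full(360, 2.0), np.full(360, 2.5), np.full(360, 3.0))
# img has shape (64, 360, 3); unequal channel values encode the motion cue.
```

Because the output is an ordinary HxWx3 uint8 array, it can be fed to a YOLOv8n model without any occupancy-grid rasterization step.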