CCTV-based vehicle tracking systems are structurally limited in their ability to link the trajectory of the same vehicle continuously across multiple cameras. In particular, the gaps between CCTVs and their limited fields of view (FOV) create blind spots, which cause object ID switches and trajectory loss and thereby reduce the reliability of real-time path prediction. This paper proposes SPOT (Spatial Prediction Over Trajectories), a map-guided LLM agent that tracks vehicles through the blind spots of multi-CCTV environments without prior training. The proposed method represents the road structure (waypoints) and CCTV placement information as documents based on 2D spatial coordinates and organizes them by chunking so that they can be queried and reasoned over in real time. It also transforms the vehicle's position into the world coordinate system using the relative position of objects observed in CCTV images together with FOV information. By combining the map's spatial information with the vehicle's moving direction, speed, and driving patterns, the method performs a beam search at the intersection level to derive the candidate CCTV locations where the vehicle is most likely to reappear after the blind spot. Experiments in a virtual city environment built on the CARLA simulator show that the proposed method accurately predicts the next CCTV at which the vehicle appears, even in blind-spot sections, and maintains continuous vehicle trajectories more effectively than existing techniques.
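As an illustration of the intersection-level beam search described in the abstract, the following is a minimal sketch under stated assumptions, not the authors' implementation. The toy road graph, the names `ROAD_GRAPH`, `turn_score`, and `predict_next_cctv`, and the heading-alignment score are all hypothetical; the abstract does not specify how SPOT scores candidate paths, and the sketch ignores speed and driving patterns.

```python
import heapq
import math

# Hypothetical toy map: intersection -> list of (next_intersection, edge_heading_deg, cctv_id).
# cctv_id is None when no camera covers that road segment (a blind-spot segment).
ROAD_GRAPH = {
    "A": [("B", 0.0, None), ("C", 90.0, None)],
    "B": [("D", 0.0, "cam_2"), ("E", 90.0, None)],
    "C": [("E", 0.0, "cam_3")],
    "D": [],
    "E": [],
}


def turn_score(current_heading, edge_heading):
    """Assumed log-score: 0 for going straight, more negative for sharper turns."""
    diff = abs((edge_heading - current_heading + 180.0) % 360.0 - 180.0)
    return math.log(max(1e-6, (1.0 + math.cos(math.radians(diff))) / 2.0))


def predict_next_cctv(graph, start, heading, beam_width=2, max_depth=3):
    """Beam search over intersections; returns (cctv_id, score) pairs, best first."""
    beam = [(0.0, start, heading)]  # (cumulative log-score, intersection, heading)
    candidates = {}  # best score found so far for each reachable camera
    for _ in range(max_depth):
        expanded = []
        for score, node, hdg in beam:
            for nxt, edge_hdg, cctv in graph.get(node, []):
                new_score = score + turn_score(hdg, edge_hdg)
                if cctv is not None:
                    # The vehicle would reappear here, so this path ends.
                    candidates[cctv] = max(candidates.get(cctv, -math.inf), new_score)
                else:
                    expanded.append((new_score, nxt, edge_hdg))
        if not expanded:
            break
        # Keep only the beam_width highest-scoring partial paths.
        beam = heapq.nlargest(beam_width, expanded, key=lambda item: item[0])
    return sorted(candidates.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    # Vehicle last seen at intersection "A", heading 0 degrees.
    print(predict_next_cctv(ROAD_GRAPH, "A", heading=0.0))
```

In this toy map, a vehicle leaving intersection "A" with heading 0 ranks `cam_2` above `cam_3`, because the path to `cam_2` requires no turns. A fuller scoring function would presumably also use the observed speed and driving patterns mentioned in the abstract.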