Multi-camera tracking with overlapping fields of view typically relies on centralized fusion, which creates computational bottlenecks that prevent deployment at scale. We present MV3DT, a fully distributed framework for real-time multi-view 3D tracking that achieves accurate identity propagation and occlusion recovery through peer-to-peer coordination, eliminating the need for central aggregation. Each camera node runs a modular pipeline comprising monocular 3D perception, distributed multi-view association, and collaborative fusion via lightweight messaging. MV3DT achieves 94.3% IDF1 and 93.3% MOTA on WILDTRACK, competitive with state-of-the-art centralized methods, and it scales to 100 cameras at a sustained 30 FPS with less than 10 ms inter-camera latency and only 2.2% communication overhead. Given camera calibrations, MV3DT operates in a zero-shot regime: it requires no scene-specific learning and is directly deployable in new environments. These results establish MV3DT as a practical solution for real-time multi-view tracking in large-scale overlapping camera networks.
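To make the per-node association and fusion step concrete, the following is a minimal sketch and not the MV3DT implementation. It assumes each node already holds tracks as calibrated ground-plane positions and receives a peer's tracks as a message. The `Track` type, the `associate_and_fuse` function, the greedy nearest-neighbor matching, the 0.5 m gate, and the "keep the smaller ID" rule are all illustrative assumptions.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class Track:
    # Hypothetical track record, not a type from MV3DT.
    track_id: int  # identity label, shared across cameras once associated
    x: float       # ground-plane position in metres (world frame)
    y: float


def associate_and_fuse(local, peer, gate=0.5):
    """Greedily match local tracks to a peer's tracks by ground-plane distance.

    A matched local track adopts the smaller of the two IDs (a stand-in for
    identity propagation) and the mean position (a stand-in for fusion).
    Unmatched local tracks are returned unchanged.
    """
    # All candidate pairs, closest first.
    pairs = sorted(
        (hypot(a.x - b.x, a.y - b.y), i, j)
        for i, a in enumerate(local)
        for j, b in enumerate(peer)
    )
    used_local, used_peer = set(), set()
    fused = list(local)
    for dist, i, j in pairs:
        if dist > gate:
            break  # every remaining pair is farther apart than the gate
        if i in used_local or j in used_peer:
            continue  # each track takes part in at most one match
        used_local.add(i)
        used_peer.add(j)
        a, b = local[i], peer[j]
        fused[i] = Track(min(a.track_id, b.track_id),
                         (a.x + b.x) / 2, (a.y + b.y) / 2)
    return fused


if __name__ == "__main__":
    local = [Track(7, 1.0, 1.0), Track(8, 5.0, 5.0)]
    peer = [Track(3, 1.2, 1.0), Track(4, 9.0, 9.0)]
    for track in associate_and_fuse(local, peer):
        print(track)
```

In this sketch only the compact track list crosses the network, which is the kind of message a peer-to-peer design can exchange without a central aggregator. A real system would likely use a more robust matcher and uncertainty-weighted fusion.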