Autonomous drone pursuit requires not only detecting drones but also predicting their trajectories in a way that permits kinematically feasible interception. Existing tracking methods optimize for prediction accuracy but ignore pursuit feasibility, yielding predicted trajectories that are physically impossible to intercept 99.9% of the time. We propose Perception-to-Pursuit (P2P), a track-centric temporal reasoning framework that bridges detection and actionable pursuit planning. Our method represents drone motion as compact 8-dimensional tokens that capture velocity, acceleration, scale, and smoothness, enabling a 12-frame causal transformer to reason about future behavior. We introduce the Intercept Success Rate (ISR) metric to measure pursuit feasibility under realistic interceptor constraints. Evaluated on the Anti-UAV-RGBT dataset, which contains 226 real drone sequences, P2P achieves an average displacement error (ADE) of 28.12 pixels and an ISR of 0.597. Relative to tracking-only baselines, this is a 77% improvement in trajectory prediction and a 597x improvement in pursuit feasibility, while drone classification accuracy remains at 100%. Our work demonstrates that temporal reasoning over motion patterns enables both accurate prediction and actionable pursuit planning.
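To make the two reported metrics concrete, the following is a minimal sketch rather than the paper's implementation. The ADE function follows the standard definition (mean Euclidean distance between predicted and ground-truth positions). The abstract does not define how ISR is computed, so `is_interceptable`, its `max_speed` bound (pixels per frame), and the per-sequence averaging in `intercept_success_rate` are all hypothetical stand-ins for "pursuit feasibility under interceptor constraints".

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]  # (x, y) image coordinates in pixels


def average_displacement_error(pred: Sequence[Point], truth: Sequence[Point]) -> float:
    """Mean Euclidean distance (pixels) between predicted and true positions."""
    if len(pred) != len(truth) or not pred:
        raise ValueError("trajectories must be non-empty and equally long")
    return sum(math.dist(p, t) for p, t in zip(pred, truth)) / len(pred)


def is_interceptable(start: Point, pred: Sequence[Point], max_speed: float) -> bool:
    """ASSUMPTION: a trajectory counts as interceptable if an interceptor that
    starts at `start` and moves at most `max_speed` pixels per frame can reach
    some predicted point no later than the frame at which the target is there."""
    return any(
        math.dist(start, p) <= max_speed * (k + 1)  # k + 1 frames elapsed
        for k, p in enumerate(pred)
    )


def intercept_success_rate(
    starts: Sequence[Point],
    preds: Sequence[Sequence[Point]],
    max_speed: float,
) -> float:
    """ASSUMPTION: ISR is taken as the fraction of sequences judged interceptable."""
    if len(starts) != len(preds) or not preds:
        raise ValueError("need one start position per predicted trajectory")
    flags = [is_interceptable(s, p, max_speed) for s, p in zip(starts, preds)]
    return sum(flags) / len(flags)
```

Under this reading, a predictor can have a low ADE and still score poorly on ISR if its predicted points lie outside the region the interceptor can reach in time, which is the gap between accuracy and feasibility that the abstract describes.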