Autonomous drone pursuit requires not only detecting drones but also predicting their trajectories in a way that permits kinematically feasible interception. Existing tracking methods optimize for prediction accuracy but ignore pursuit feasibility, producing predicted trajectories that are physically impossible to intercept 99.9% of the time. We propose Perception-to-Pursuit (P2P), a track-centric temporal reasoning framework that bridges detection and actionable pursuit planning. Our method represents drone motion as compact 8-dimensional tokens capturing velocity, acceleration, scale, and smoothness, enabling a 12-frame causal transformer to reason about future behavior. We introduce the Intercept Success Rate (ISR) metric to measure pursuit feasibility under realistic interceptor constraints. Evaluated on the Anti-UAV-RGBT dataset with 226 real drone sequences, P2P achieves an average displacement error of 28.12 pixels and an ISR of 0.597, a 77% improvement in trajectory prediction and a 597x improvement in pursuit feasibility over tracking-only baselines, while maintaining 100% drone classification accuracy. Our work demonstrates that temporal reasoning over motion patterns enables both accurate prediction and actionable pursuit planning.
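The two reported metrics can be illustrated with a minimal sketch. Average displacement error is the standard mean Euclidean distance between predicted and ground-truth positions. The abstract does not define how ISR is computed, so the feasibility test below is purely an assumption: a hypothetical interceptor that starts at a fixed pixel position and moves at most `max_speed` pixels per frame. The function names, the `start` and `max_speed` parameters, and the straight-line speed model are illustrative and are not taken from the paper.

```python
import math


def average_displacement_error(pred, truth):
    """Mean Euclidean distance (in pixels) between predicted and true points."""
    if len(pred) != len(truth) or not pred:
        raise ValueError("trajectories must be non-empty and of equal length")
    return sum(math.dist(p, t) for p, t in zip(pred, truth)) / len(pred)


def is_interceptable(pred, start, max_speed):
    """Assumed feasibility test (not the paper's definition).

    Returns True if a pursuer starting at `start` and moving at most
    `max_speed` pixels per frame can reach some predicted point no later
    than the frame that point is predicted for (frames counted from 1).
    """
    for frame, point in enumerate(pred, start=1):
        if math.dist(start, point) <= max_speed * frame:
            return True
    return False


def intercept_success_rate(trajectories, start, max_speed):
    """Fraction of predicted trajectories that pass the assumed feasibility test."""
    if not trajectories:
        raise ValueError("need at least one trajectory")
    feasible = sum(is_interceptable(t, start, max_speed) for t in trajectories)
    return feasible / len(trajectories)


# Toy data: two-frame trajectories in pixel coordinates.
predicted = [(3.0, 4.0), (6.0, 8.0)]
ground_truth = [(0.0, 0.0), (6.0, 8.0)]
near_track = [(1.0, 0.0), (2.0, 0.0)]

ade = average_displacement_error(predicted, ground_truth)
isr = intercept_success_rate([predicted, near_track], start=(0.0, 0.0), max_speed=4.0)
```

In this toy case the first trajectory always stays ahead of a pursuer limited to 4 pixels per frame, while the second is reachable in the first frame, so half of the trajectories count as interceptable. A realistic ISR would presumably also account for acceleration and turn-rate limits, which this sketch ignores.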