In this paper, we propose a reinforcement-learning-based beam weighting framework that couples a policy network with an augmented weighted least squares (WLS) estimator for accurate, low-complexity positioning in multi-beam LEO constellations. Unlike conventional approaches that depend on geometry or channel state information (CSI), the policy learns directly from uplink pilot responses and geometry features, enabling robust localization without explicit CSI estimation. The augmented WLS jointly estimates position and receiver clock bias, improving numerical stability under dynamic beam geometry. Across representative scenarios, the proposed method reduces the mean positioning error by 99.3% compared with the geometry-based baseline, achieving 0.395 m RMSE with near real-time inference.
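To make the joint position and clock-bias estimation concrete, the following is a minimal sketch of an augmented WLS solver, not the paper's implementation. It assumes pseudorange-style measurements of the form range plus clock bias (the abstract does not specify the measurement model), expresses the bias in metres, and takes the per-beam weights as a plain input, whereas in the proposed framework they would be produced by the policy network. The function name `augmented_wls` and all parameters are hypothetical.

```python
import numpy as np


def augmented_wls(sat_pos, pseudoranges, weights, x0=None, iters=10, tol=1e-6):
    """Gauss-Newton weighted least squares over the state [x, y, z, b].

    Assumed measurement model (illustrative only):
        rho_i = ||p - s_i|| + b
    where p is the receiver position, s_i the i-th transmitter position,
    and b the receiver clock bias expressed in metres.
    """
    sat_pos = np.asarray(sat_pos, dtype=float)        # shape (N, 3)
    rho = np.asarray(pseudoranges, dtype=float)       # shape (N,)
    W = np.diag(np.asarray(weights, dtype=float))     # per-measurement weights
    state = np.zeros(4) if x0 is None else np.asarray(x0, dtype=float).copy()

    for _ in range(iters):
        diff = state[:3] - sat_pos                    # receiver minus transmitter
        dist = np.linalg.norm(diff, axis=1)
        residual = rho - (dist + state[3])
        # Jacobian: unit line-of-sight vectors, augmented with a column of
        # ones for the clock-bias term.
        H = np.hstack([diff / dist[:, None], np.ones((len(rho), 1))])
        # Weighted normal equations: (H^T W H) delta = H^T W r
        delta = np.linalg.solve(H.T @ W @ H, H.T @ W @ residual)
        state += delta
        if np.linalg.norm(delta) < tol:
            break

    return state[:3], state[3]
```

Calling `augmented_wls(sat_pos, pseudoranges, weights)` returns the estimated position and the clock bias. With uniform weights this reduces to ordinary least squares; down-weighting unreliable beams is the point at which a learned weighting policy would enter.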