Indoor positioning using ultra-wideband (UWB) technology has gained interest due to its potential for centimeter-level accuracy. However, multipath effects and non-line-of-sight conditions cause ranging errors between anchors and tags. Existing approaches for mitigating these ranging errors rely on collecting large labeled datasets, which makes them impractical for real-world deployments. This paper proposes a novel self-supervised deep reinforcement learning approach that does not require labeled ground truth data. A reinforcement learning agent uses the channel impulse response as its state and predicts corrections that minimize the error between corrected and estimated ranges. The agent learns in a self-supervised manner by iteratively improving corrections, which are generated by combining the predictability of trajectories with filtering and smoothing. Experiments on real-world UWB measurements demonstrate performance comparable to state-of-the-art supervised methods, while avoiding their dependence on labeled data and their limited generalizability. This makes self-supervised deep reinforcement learning a promising solution for practical and scalable UWB ranging error correction.
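The self-supervised signal described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the centered moving average stands in for the filtering and smoothing step, the sign convention `corrected = raw - correction` is chosen arbitrarily, the function names `smooth_ranges` and `reward` are hypothetical, and the range values are made up. The paper's actual filter, reward, and agent are not specified in the abstract.

```python
from typing import List


def smooth_ranges(ranges: List[float], window: int = 5) -> List[float]:
    """Centered moving average over a range sequence.

    Assumed stand-in for the filtering/smoothing step: because a tag moves
    along a predictable trajectory, consecutive ranges should vary smoothly.
    """
    half = window // 2
    smoothed = []
    for i in range(len(ranges)):
        lo = max(0, i - half)
        hi = min(len(ranges), i + half + 1)
        smoothed.append(sum(ranges[lo:hi]) / (hi - lo))
    return smoothed


def reward(raw_range: float, correction: float, estimated_range: float) -> float:
    """Negative absolute gap between the corrected range and the estimate.

    A correction that moves the raw range closer to the trajectory-based
    estimate earns a higher (less negative) reward. No ground truth is used.
    """
    corrected = raw_range - correction  # assumed sign convention
    return -abs(corrected - estimated_range)


# Made-up raw ranges in metres; index 3 carries an NLOS-like positive bias.
raw = [2.00, 2.10, 2.20, 2.80, 2.40, 2.50, 2.60]
est = smooth_ranges(raw, window=5)

# Compare two candidate corrections for the biased measurement.
r_none = reward(raw[3], 0.0, est[3])
r_good = reward(raw[3], 0.4, est[3])
print(f"no correction: {r_none:.3f}, 0.4 m correction: {r_good:.3f}")
```

In a full system, the correction would come from a policy conditioned on the channel impulse response rather than being chosen by hand, and the estimated ranges would be refined iteratively as the corrections improve.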