User Next Location Prediction (UNLP), which predicts the next location a user will move to given their trajectory, is an indispensable task for a wide range of applications. Previous studies that train on large-scale trajectory datasets held on a single server have achieved remarkable performance on the UNLP task. However, in real-world applications, privacy concerns have raised legal and ethical issues, leading to restrictions on sharing human trajectory datasets with any other server. In response, Federated Learning (FL) has emerged to address the personal privacy issue by training models collaboratively across multiple clients (i.e., users) and then aggregating them. Although previous studies have applied FL to UNLP, they still fail to achieve reliable performance because of the heterogeneity of clients' mobility. To tackle this problem, we propose Federated Learning for Geographic Information (FedGeo), an FL framework specialized for UNLP that alleviates the heterogeneity of clients' mobility and guarantees personal privacy protection. First, we incorporate prior global geographic adjacency information into the local client model, because in FL each client holds only a heterogeneous subset of the overall trajectories and therefore learns the spatial correlation between locations only partially. Second, we introduce a novel aggregation method that minimizes the gap between client models, addressing the client drift that arises when client models diverge while learning from heterogeneous data. Lastly, we probabilistically exclude clients with extremely heterogeneous data from the FL process by focusing on clients who visit relatively diverse locations. We show that FedGeo outperforms other FL methods on the UNLP task. We also validate our model in a real-world application using our own customers' mobile phones and the FL agent system.
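To make the last two ideas more concrete, the following is a minimal sketch of one federated round with diversity-aware client sampling. It is not FedGeo's actual procedure: the abstract does not specify the selection rule or the aggregation formula, so the diversity score (number of distinct locations visited) is an assumption, and plain FedAvg-style weighted averaging stands in for the paper's novel aggregation method. All function names are hypothetical, and every client is assumed to have a non-empty trajectory.

```python
import random
from typing import Dict, List, Sequence


def selection_weights(trajectories: Dict[str, Sequence[int]]) -> Dict[str, float]:
    """Assumed diversity score: each client's share of distinct visited locations.

    Clients that visit more distinct locations get a higher sampling weight.
    Requires at least one client with a non-empty trajectory.
    """
    distinct = {cid: len(set(traj)) for cid, traj in trajectories.items()}
    total = sum(distinct.values())
    return {cid: n / total for cid, n in distinct.items()}


def sample_clients(weights: Dict[str, float], k: int, rng: random.Random) -> List[str]:
    """Draw up to k distinct clients, favouring those with higher weight."""
    pool = dict(weights)
    chosen: List[str] = []
    for _ in range(min(k, len(pool))):
        ids = list(pool)
        pick = rng.choices(ids, weights=[pool[c] for c in ids], k=1)[0]
        chosen.append(pick)
        del pool[pick]  # sample without replacement
    return chosen


def fedavg(updates: List[List[float]], sizes: List[int]) -> List[float]:
    """Standard FedAvg: average parameter vectors weighted by local data size.

    This is a stand-in, not the aggregation method proposed in the paper.
    """
    total = sum(sizes)
    dim = len(updates[0])
    return [sum(u[i] * s for u, s in zip(updates, sizes)) / total for i in range(dim)]


if __name__ == "__main__":
    # Toy location-ID trajectories; client "a" only ever visits location 1.
    trajectories = {"a": [1, 1, 1], "b": [1, 2, 3], "c": [4, 5, 4, 6]}
    weights = selection_weights(trajectories)
    selected = sample_clients(weights, k=2, rng=random.Random(0))
    print(weights, selected)
```

Under this assumed rule, a client that repeatedly visits a single location is rarely sampled but never excluded outright, which mirrors the probabilistic (rather than hard) exclusion described in the abstract.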