PFL-LSTR: A privacy-preserving framework for driver intention inference based on in-vehicle and out-vehicle information

Collisions can be reduced when an intelligent vehicle anticipates the movement intentions of other drivers. Typically, when the human driver of another vehicle (referred to as the target vehicle) engages in specific behaviors, such as checking the rearview mirror before a lane change, these behaviors provide a valuable clue to that driver's intentions. Furthermore, the target driver's intentions can be influenced and shaped by the driving environment. For example, if the target vehicle is too close to a leading vehicle, its driver may abandon the lane-change decision. Similarly, a following vehicle in the target lane that is too close to the target vehicle could lead the driver to reverse the decision to change lanes. Knowledge of such intentions for all vehicles in a traffic stream can help enhance traffic safety. Unfortunately, such information is often captured in the form of images or videos, and using personally identifiable data to train a general model could violate user privacy. Federated Learning (FL) is a promising tool for resolving this conundrum: FL efficiently trains models without exposing the underlying data. This paper introduces a Personalized Federated Learning (PFL) model embedded with a long short-term transformer (LSTR) framework. The framework predicts drivers' intentions by leveraging in-vehicle videos (of driver movement, gestures, and expressions) and out-of-vehicle videos (of the vehicle's surroundings, namely the front and rear areas). The proposed PFL-LSTR framework is trained and tested on real-world driving data collected from human drivers on Interstate 65 in Indiana. The results suggest that PFL-LSTR exhibits high adaptability and high precision, and that out-of-vehicle information (particularly, the driver's rearview-mirror viewing actions) is important because it helps reduce false positives and thereby enhances the precision of driver intention inference.
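
The abstract does not state how the personalized federated model is aggregated, so the following is only an illustrative sketch of one common PFL pattern, not the authors' method. It assumes that each client (vehicle) keeps its raw data and a personal parameter locally, while only a shared parameter is sent to the server and combined by a data-size-weighted average (FedAvg-style). A toy scalar model `y = shared * x + personal` stands in for the LSTR network, and all names (`local_update`, `federated_round`, `make_client`, `run_demo`) are hypothetical.

```python
import random


def local_update(shared, personal, data, lr=0.05, epochs=20):
    """Train y ~ shared * x + personal with SGD on one client's private data."""
    w, b = shared, personal
    for _ in range(epochs):
        for x, y in data:
            err = (w * x + b) - y
            w -= lr * err * x   # gradient step on the shared weight
            b -= lr * err       # gradient step on the personal offset
    return w, b


def federated_round(shared, personals, client_data):
    """One round: clients train locally, the server averages only the shared weight."""
    updates, new_personals = [], []
    for personal, data in zip(personals, client_data):
        w_i, b_i = local_update(shared, personal, data)
        updates.append((w_i, len(data)))  # only the weight and a sample count leave the client
        new_personals.append(b_i)         # the personal offset never leaves the client
    total = sum(n for _, n in updates)
    new_shared = sum(w * n for w, n in updates) / total  # data-size-weighted average
    return new_shared, new_personals


def make_client(offset, n=20, seed=0):
    """Synthetic private data (assumption): common slope 2.0, client-specific offset."""
    rng = random.Random(seed)
    return [(x, 2.0 * x + offset) for x in (rng.uniform(-1, 1) for _ in range(n))]


def run_demo(rounds=30):
    clients = [make_client(-1.0, seed=1), make_client(0.5, seed=2), make_client(3.0, seed=3)]
    shared, personals = 0.0, [0.0] * len(clients)
    for _ in range(rounds):
        shared, personals = federated_round(shared, personals, clients)
    return shared, personals


if __name__ == "__main__":
    shared, personals = run_demo()
    print("shared weight:", round(shared, 4))
    print("personal offsets:", [round(b, 4) for b in personals])
```

In this toy setting the server only ever receives each client's updated weight and sample count, never the `(x, y)` pairs. The shared weight captures what the clients have in common, and each personal offset adapts to one client's own data, which is the kind of per-driver adaptability a personalized scheme aims for.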
