Embodied Artificial Intelligence (AI), which empowers robots to navigate, perceive, and interact within virtual environments, has attracted significant attention owing to remarkable advances in computer vision and large language models. Privacy emerges as a pivotal concern in embodied AI, since robots access substantial amounts of personal information. However, the issue of privacy leakage in embodied AI tasks, particularly for reinforcement learning algorithms, has not received adequate attention in research. This paper addresses this gap by proposing an attack on the training process of value-based and gradient-based reinforcement learning algorithms, using gradient inversion to reconstruct states, actions, and supervisory signals. Attacking via gradients is motivated by the fact that commonly employed federated learning techniques optimize models using only gradients computed on private user data, without storing or transmitting the data to public servers. Nevertheless, these gradients contain sufficient information to potentially expose that private data. To validate our approach, we conducted experiments on the AI2THOR simulator and evaluated our algorithm on active perception, a prevalent task in embodied AI. The experimental results demonstrate that our method successfully reconstructs all information from the data in 120 room layouts. Check our website for videos.
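To make the threat model concrete, the following is a minimal, framework-free sketch of why shared gradients can leak private inputs. It uses a single linear neuron with bias under squared loss, not the paper's actual networks or algorithm: for that model, the weight gradient equals the bias gradient scaled by the input, so an attacker who sees only the gradients recovers the input exactly. All names and values are illustrative assumptions.

```python
# Toy model: y_hat = w*x + b, loss L = 0.5 * (y_hat - y)**2.
# Gradients a federated client would share:
#   dL/dw = (y_hat - y) * x,   dL/db = (y_hat - y)
# so the private input x is recoverable as (dL/dw) / (dL/db).

def gradients(w, b, x, y):
    """Gradients of 0.5*(w*x + b - y)**2 with respect to w and b."""
    err = w * x + b - y
    return err * x, err  # (dL/dw, dL/db)

def invert(grad_w, grad_b):
    """Attacker-side reconstruction of x from the shared gradients only."""
    return grad_w / grad_b

# A client computes gradients on private data (x=3.0, y=7.0) ...
w, b = 0.5, -0.2
gw, gb = gradients(w, b, x=3.0, y=7.0)

# ... and an attacker who observes only (gw, gb) recovers x:
print(invert(gw, gb))  # approximately 3.0
```

For deeper networks and batched data an exact closed form no longer exists; gradient-inversion attacks instead optimize a dummy input so that its gradients match the observed ones, which is the general strategy the abstract refers to.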