With the growing popularity of 3D volumetric video applications such as autonomous driving, virtual reality, and mixed reality, developers have turned to deep learning to compress volumetric video frames, i.e., point clouds, for video upstreaming. The latest deep learning-based solutions offer higher efficiency, lower distortion, and better hardware support than traditional codecs such as MPEG and JPEG. However, they introduce privacy threats, in particular reconstruction attacks that aim to recover the original input point cloud from intermediate results. In this paper, we design VVRec, to the best of our knowledge the first reconstruction attack scheme targeting DL-based volumetric video upstreaming. VVRec reconstructs high-quality point clouds from intermediate results intercepted in transmission, using four well-trained neural network modules of our design. Leveraging the latest latent diffusion models with Gamma distribution together with a refinement algorithm, VVRec excels in reconstruction quality and color recovery, and surpasses existing defenses. We evaluate VVRec on three volumetric video datasets. The results show that VVRec achieves 64.70 dB reconstruction accuracy, an impressive 46.39% reduction in distortion over baselines.