Volumetric videoconferencing enables immersive six-degrees-of-freedom (6DoF) interaction by jointly transmitting visual appearance and 3D geometry. However, delivering volumetric video over today's networks remains challenging due to high bandwidth demands, strict real-time latency constraints, and frequent packet loss. Packet loss not only degrades visual quality but also corrupts geometric structure, leading to severe artifacts and video freezes that significantly harm Quality of Experience (QoE). Existing solutions either optimize volumetric video under the assumption of a reliable network or focus on loss recovery for 2D video; neither is sufficient for volumetric videoconferencing. In this paper, we present ReVo, a loss-resilient volumetric videoconferencing system that jointly recovers RGB and depth content under packet loss while meeting real-time constraints on desktop-grade hardware. ReVo leverages the insight that effective recovery requires a cross-layer, modality-aware design. It decouples volumetric video into RGB and depth streams, selectively protects critical content using network-layer forward error correction (FEC), and reconstructs corrupted non-critical frames using a post-decode neural recovery module. ReVo is implemented end-to-end over WebRTC and supports both traditional and neural video codecs. Our evaluations using real-world loss traces show that ReVo improves median SSIM by up to 32% for RGB content and up to 13% for depth content, and reduces video freezes by up to 95.7% compared to existing techniques.
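To make the idea of selectively protecting critical content concrete, the following is a minimal sketch of single-parity XOR FEC applied only to frames marked critical. It is an illustration under stated assumptions, not ReVo's implementation: the abstract does not specify the FEC code, the packetization, or how criticality is decided, so the `is_critical` flag, the equal-length packets, and all function names here are hypothetical.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_parity(packets: List[bytes]) -> bytes:
    """Build one parity packet over equal-length data packets."""
    return reduce(xor_bytes, packets)


def protect(packets: List[bytes], is_critical: bool) -> List[bytes]:
    """Append a parity packet only for critical frames.

    `is_critical` is a hypothetical policy flag; non-critical frames are
    sent unprotected and left to some later recovery stage.
    """
    if is_critical:
        return list(packets) + [make_parity(packets)]
    return list(packets)


def recover(received: List[Optional[bytes]]) -> Optional[List[bytes]]:
    """Recover the data packets of a protected frame.

    `received` holds the data packets followed by the parity packet, with
    None marking a lost packet. A single parity packet can repair at most
    one loss; with two or more losses this returns None.
    """
    missing = [i for i, p in enumerate(received) if p is None]
    if len(missing) > 1:
        return None
    repaired = list(received)
    if missing:
        present = [p for p in received if p is not None]
        # XOR of all surviving packets equals the one missing packet.
        repaired[missing[0]] = reduce(xor_bytes, present)
    return repaired[:-1]  # drop the parity packet


if __name__ == "__main__":
    frame = [b"rgb0", b"rgb1", b"dep0"]  # toy equal-length packets
    sent = protect(frame, is_critical=True)
    sent[1] = None  # simulate one lost packet
    print(recover(sent) == frame)
```

A single parity packet tolerates only one loss per frame; a deployed system would more plausibly use a stronger code and adapt the redundancy to the observed loss rate, which this sketch does not attempt.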