CAGS: Color-Adaptive Volumetric Video Streaming with Dynamic 3D Gaussian Splatting

Volumetric video (VV) streaming enables real-time, immersive access to remote 3D environments, powering telepresence, ecological monitoring, and robotic teleoperation. These applications turn VV streaming into a real-time interface to remote physical environments and impose new system-level demands: photorealistic scene representation, low-latency interaction, and robust performance over heterogeneous networks. 3D Gaussian Splatting (3DGS) is widely used for real-time photorealistic rendering and offers superior visual quality and rendering performance, but its high bandwidth consumption makes streaming challenging. Furthermore, existing density-based Levels of Detail (LoD) methods, the foundation of adaptive VV streaming, are not well suited to Gaussian representations and lead to visible gaps and severe quality degradation. Recent studies have also explored attribute compression to reduce bandwidth consumption. Our preliminary studies reveal that aggressive attribute compression primarily causes color distortion, which can be effectively corrected in the rendered image using a reference image. Motivated by these findings, we propose a novel Color-Adaptive scheme for adaptive VV streaming that uses vector quantization (VQ) to establish LoDs and corrects color distortion with low-resolution reference images. We further present CAGS, an adaptive VV streaming system compatible with diverse Gaussian representations, which integrates the Color-Adaptive scheme by rendering reference images on the streaming server and performing color restoration on the client. Extensive experiments on our prototype system demonstrate that CAGS outperforms existing adaptive streaming systems in PSNR by 5$\sim$20 dB under fluctuating bandwidth, operates significantly faster than existing scalable Gaussian compression methods, and generalizes across different Gaussian representations.
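The abstract does not specify how the client restores color, so the following is only a minimal sketch of the general idea under a strong simplifying assumption: that the distortion can be modeled as a per-channel affine change (gain and offset). This is not the CAGS algorithm. The function names (`downsample`, `fit_channel_affine`, `restore_color`, `psnr`) and the box-filter downsampling are hypothetical choices made for illustration. The sketch fits the correction at the resolution of the reference image and applies it to the full-resolution render, with PSNR as the quality metric named in the abstract.

```python
import numpy as np


def downsample(img, factor):
    """Box-filter downsample an (H, W, 3) float image by an integer factor."""
    h, w, c = img.shape
    h2, w2 = h // factor, w // factor
    img = img[: h2 * factor, : w2 * factor]  # crop so the blocks tile evenly
    return img.reshape(h2, factor, w2, factor, c).mean(axis=(1, 3))


def fit_channel_affine(src_lo, ref_lo):
    """Least-squares gain and offset per channel mapping src_lo onto ref_lo.

    Assumption (not from the paper): distortion is affine in each channel.
    """
    gains, offsets = np.empty(3), np.empty(3)
    for ch in range(3):
        x = src_lo[..., ch].ravel()
        y = ref_lo[..., ch].ravel()
        design = np.stack([x, np.ones_like(x)], axis=1)
        (gain, offset), *_ = np.linalg.lstsq(design, y, rcond=None)
        gains[ch], offsets[ch] = gain, offset
    return gains, offsets


def restore_color(rendered, ref_lo, factor):
    """Correct a full-resolution render using a low-resolution reference."""
    gains, offsets = fit_channel_affine(downsample(rendered, factor), ref_lo)
    return np.clip(rendered * gains + offsets, 0.0, 1.0)


def psnr(a, b, peak=1.0):
    """Peak signal-to-noise ratio in dB for images with values in [0, peak]."""
    mse = np.mean((a - b) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak**2 / mse)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    truth = rng.uniform(0.2, 0.8, size=(64, 64, 3))  # stand-in for a clean render
    # Synthetic color distortion, standing in for aggressive attribute compression.
    distorted = truth * np.array([0.7, 1.1, 0.9]) + np.array([0.1, -0.05, 0.02])
    reference = downsample(truth, 4)  # what a server might send at low resolution
    restored = restore_color(distorted, reference, 4)
    print("PSNR before:", psnr(distorted, truth))
    print("PSNR after: ", psnr(restored, truth))
```

In this toy setup the reference carries 1/16 of the pixels of the full frame, which illustrates why a low-resolution reference can be cheap to transmit. Real color distortion from attribute compression is unlikely to be globally affine, so a practical restoration step would need a richer model.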
