GETA-3DGS: Automatic Joint Structured Pruning and Quantization for 3D Gaussian Splatting

3D Gaussian splatting (3DGS) is a state-of-the-art representation for real-time photorealistic novel-view synthesis, yet a single high-fidelity scene typically occupies hundreds of megabytes to several gigabytes, exceeding the budgets of mobile, immersive, and volumetric video platforms. Existing 3DGS compression methods (e.g., HAC++, FlexGaussian, LP-3DGS) treat pruning, quantization, and entropy coding as separate stages and rely on hand-tuned heuristics (opacity thresholds, fixed bit-widths, SH truncation). This limits cross-scene generalization and prevents users from specifying a target rate or quality budget. We propose GETA-3DGS, to our knowledge the first end-to-end automatic framework for joint structured pruning and quantization of 3DGS. Building on GETA, a joint pruning-quantization method for deep networks, we contribute: (i) a 3DGS-aware quantization-aware dependency graph (QADG) that treats each Gaussian primitive as a group with five attribute sub-nodes and degree-aware SH sub-nodes; (ii) a render-aware saliency that fuses transmittance-weighted contribution, screen-space gradient, and pixel coverage into a Gaussian-level importance score; and (iii) a heterogeneous per-attribute mixed-precision scheme co-optimized with structural sparsity under a projected partial saliency-guided (PPSG) descent guarantee. On Mip-NeRF 360, Tanks and Temples, and Deep Blending, GETA-3DGS operates directly on raw Gaussian primitives rather than on a post-hoc anchor representation, delivering roughly 5x storage reduction over vanilla 3DGS with no per-scene thresholds. Bit-width policy is the dominant rate-distortion lever: relative to our heterogeneous allocation, a uniform 6-bit cap costs up to 6.74 dB on view-dependent scenes, consistent with an information-theoretic reverse-water-filling analysis we develop. GETA-3DGS is complementary to existing codecs: entropy coding (HAC++, CompGS) operates downstream, so the two can be composed.
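
To give some intuition for why a heterogeneous bit allocation can beat a uniform cap, the following is a minimal sketch of textbook reverse water-filling for independent Gaussian components under a squared-error distortion. It is not the analysis developed in this work: the function names, the per-component variances, and the 4-bit budget are all hypothetical and chosen only for illustration.

```python
import math


def reverse_water_filling(variances, total_rate_bits, iters=100):
    """Textbook reverse water-filling for independent Gaussian components.

    Finds a water level theta so that the per-component rates
    R_i = max(0, 0.5 * log2(var_i / theta)) sum to the rate budget.
    Returns (theta, rates, distortions) with D_i = min(theta, var_i).
    """
    def rates_at(theta):
        return [max(0.0, 0.5 * math.log2(v / theta)) for v in variances]

    # Total rate decreases monotonically in theta, so bisect on theta.
    lo, hi = 1e-12, max(variances)
    for _ in range(iters):
        mid = math.sqrt(lo * hi)  # geometric midpoint, since theta spans decades
        if sum(rates_at(mid)) > total_rate_bits:
            lo = mid
        else:
            hi = mid
    theta = hi
    rates = rates_at(theta)
    distortions = [min(theta, v) for v in variances]
    return theta, rates, distortions


def uniform_distortion(variances, total_rate_bits):
    """Total distortion when every component gets the same rate.

    Uses the Gaussian distortion-rate function D(R) = var * 2^(-2R).
    """
    r = total_rate_bits / len(variances)
    return sum(v * 2.0 ** (-2.0 * r) for v in variances)


if __name__ == "__main__":
    # Hypothetical attribute variances (illustrative only, not measured values).
    variances = [16.0, 4.0, 1.0, 0.25]
    theta, rates, dists = reverse_water_filling(variances, total_rate_bits=4.0)
    print("water level:", theta)
    print("per-component rates (bits):", rates)
    print("water-filling distortion:", sum(dists))
    print("uniform-rate distortion:", uniform_distortion(variances, 4.0))
```

In this toy setting the high-variance components receive most of the bits and the lowest-variance component receives none, whereas a uniform allocation spends rate where it buys little distortion reduction.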
