Multi-Objective Pareto-Front Optimization for Efficient Adaptive VVC Streaming

Adaptive video streaming has improved video delivery over the past years. Efficient adaptive streaming is content- and codec-dependent, and it requires a balance among coding performance objectives such as bitrate, video quality, and decoding complexity. This paper proposes a multi-objective Pareto-front (PF) optimization framework to construct quality-monotonic, content-adaptive bitrate ladders for Versatile Video Coding (VVC) streaming. The framework jointly optimizes video quality, bitrate, and decoding time, where decoding time serves as a practical proxy for decoding energy. Two strategies are introduced: the Joint Rate-Quality-Time Pareto Front (JRQT-PF) and the Joint Quality-Time Pareto Front (JQT-PF), each exploring different tradeoff formulations and objective prioritizations. The ladders are constructed under quality monotonicity constraints to ensure a consistent Quality of Experience (QoE) during adaptive streaming. Experiments are conducted on a large-scale UHD dataset (Inter-4K), with quality assessed using PSNR, VMAF, and XPSNR, and complexity measured via decoding time and energy consumption. Compared to a widely used fixed ladder, the JQT-PF method achieves 11.76% average bitrate savings and reduces average decoding time by 0.29% at the same XPSNR. More aggressive configurations yield up to 27.88% bitrate savings at the cost of increased complexity. The JRQT-PF strategy, on the other hand, offers more controlled tradeoffs, achieving 6.38% bitrate savings and a 6.17% decoding time reduction. The framework outperforms existing methods, including fixed ladders, VMAF- and XPSNR-based dynamic resolution selection, and complexity-aware benchmarks. The results confirm that PF optimization with decoding time constraints enables sustainable, high-quality streaming tailored to network and device capabilities.
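
To make the idea of a quality-monotonic Pareto-front ladder concrete, the following is a minimal sketch and not the paper's implementation. It assumes each candidate encoding is described by three measurements (bitrate, a quality score such as XPSNR, and decoding time), keeps only the non-dominated candidates, and then picks one rung per target bitrate while requiring quality to increase strictly from rung to rung. The names `Encoding`, `dominates`, `pareto_front`, and `monotonic_ladder`, the selection rule, the target bitrates, and all numbers are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Encoding:
    label: str            # hypothetical "resolution_QP" tag
    bitrate_kbps: float   # lower is better
    quality: float        # e.g. XPSNR in dB, higher is better
    decode_time_s: float  # lower is better


def dominates(a: Encoding, b: Encoding) -> bool:
    """True if a is no worse than b in all three objectives and better in one."""
    no_worse = (a.bitrate_kbps <= b.bitrate_kbps
                and a.quality >= b.quality
                and a.decode_time_s <= b.decode_time_s)
    strictly_better = (a.bitrate_kbps < b.bitrate_kbps
                       or a.quality > b.quality
                       or a.decode_time_s < b.decode_time_s)
    return no_worse and strictly_better


def pareto_front(points):
    """Keep only the candidates that no other candidate dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def monotonic_ladder(front, targets_kbps):
    """Assumed selection rule: for each target bitrate in ascending order,
    take the highest-quality front point that fits under the target and
    improves on the previous rung's quality; skip targets with no such point."""
    ladder = []
    last_quality = float("-inf")
    for target in sorted(targets_kbps):
        feasible = [p for p in front
                    if p.bitrate_kbps <= target and p.quality > last_quality]
        if not feasible:
            continue
        # Prefer higher quality; break ties with the shorter decoding time.
        best = max(feasible, key=lambda p: (p.quality, -p.decode_time_s))
        ladder.append(best)
        last_quality = best.quality
    return ladder


# Made-up measurements for a single video (illustrative only).
candidates = [
    Encoding("2160p_qp32", 8000, 42.0, 3.0),
    Encoding("2160p_qp37", 4500, 38.5, 2.8),  # dominated by 1080p_qp27
    Encoding("1080p_qp27", 4000, 39.0, 1.2),
    Encoding("1080p_qp32", 2500, 37.0, 1.0),
    Encoding("540p_qp27", 1200, 33.0, 0.4),
]

front = pareto_front(candidates)
ladder = monotonic_ladder(front, targets_kbps=[1500, 3000, 6000, 10000])
for rung in ladder:
    print(rung.label, rung.bitrate_kbps, rung.quality, rung.decode_time_s)
```

In this toy data the 2160p_qp37 candidate is discarded because 1080p_qp27 delivers higher quality at a lower bitrate and a shorter decoding time. A two-objective variant in the spirit of JQT-PF could drop the bitrate comparison inside `dominates`; whether that matches the paper's formulation cannot be determined from the abstract alone.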
