In HTTP adaptive live streaming applications, video segments are encoded at a fixed set of bitrate-resolution pairs known as a bitrate ladder. Live encoders use the fastest available encoding configuration, referred to as a preset, to minimize encoding latency. However, optimizing the preset and the number of CPU threads for each encoding instance may (i) increase quality and (ii) make CPU utilization more efficient. For low-latency live encoders, the encoding speed is expected to be greater than or equal to the video framerate. In this light, this paper introduces a Just Noticeable Difference (JND)-Aware Low-latency Encoding Scheme (JALE), which uses random forest-based models to jointly determine the optimized encoder preset and thread count for each representation, based on video complexity features, the target encoding speed, the total number of available CPU threads, and the target encoder. Experimental results show that, on average, JALE yields a quality improvement of 1.32 dB PSNR and 5.38 VMAF points at the same bitrate, compared to fastest-preset encoding of the HTTP Live Streaming (HLS) bitrate ladder using the open-source x265 HEVC encoder with eight CPU threads per representation. These improvements are achieved while maintaining the desired encoding speed. Furthermore, considering a JND of six VMAF points, JALE on average reduces overall storage by 72.70 %, the total number of CPU threads used by 63.83 %, and the overall encoding time by 37.87 %.
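The selection step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: `select_preset_and_threads` and `toy_predict_fps` are hypothetical names, the toy speed function merely stands in for the trained random forest models, and the search order (slowest preset first, then fewest threads) is an assumed simplification of the joint decision. Only the x265 preset names and the constraint that encoding speed must reach the target framerate come from the encoder and the text above.

```python
from typing import Callable, Optional, Sequence, Tuple

# x265 preset names, ordered from fastest to slowest.
X265_PRESETS = [
    "ultrafast", "superfast", "veryfast", "faster", "fast",
    "medium", "slow", "slower", "veryslow", "placebo",
]

# Hypothetical interface: (preset, thread count) -> predicted encoding speed in fps.
SpeedModel = Callable[[str, int], float]


def select_preset_and_threads(
    predict_fps: SpeedModel,
    target_fps: float,
    max_threads: int,
    presets: Sequence[str] = X265_PRESETS,
) -> Optional[Tuple[str, int]]:
    """Pick the slowest preset, then the fewest threads, whose predicted
    speed still reaches target_fps. Returns None if nothing qualifies."""
    for preset in reversed(presets):              # try slowest presets first
        for threads in range(1, max_threads + 1):  # fewest threads first
            if predict_fps(preset, threads) >= target_fps:
                return preset, threads
    return None


def toy_predict_fps(preset: str, threads: int) -> float:
    """Assumed stand-in for a trained speed model: single-thread speed
    halves with each slower preset and scales linearly with threads."""
    single_thread_fps = 240.0 * 0.5 ** X265_PRESETS.index(preset)
    return single_thread_fps * threads


if __name__ == "__main__":
    choice = select_preset_and_threads(toy_predict_fps, target_fps=24.0, max_threads=8)
    print(choice)
```

In a real system the toy function would be replaced by a model that also takes the video complexity features and the representation's bitrate and resolution as inputs, as the abstract describes.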