Multimedia compression allows us to watch videos, view pictures, and hear sounds within limited bandwidth, which has helped the internet flourish. Over the past decades, multimedia compression has achieved great success using hand-crafted features and systems. With advances in artificial intelligence and video compression, a large body of research has emerged that applies neural networks to the video compression task in order to replace these complicated systems. Beyond producing more advanced algorithms, researchers have also extended compression to different types of content, such as User Generated Content (UGC). With the rapid development of mobile devices, screen content videos have become an important part of multimedia data. However, the community lacks a large-scale dataset for screen content video compression, which impedes the development of the corresponding learning-based algorithms. To fill this gap and accelerate research on this special type of video, we propose the Large-scale Screen Content Dataset (LSCD), which contains 714 source sequences. We also provide an analysis of the proposed dataset that shows some features of screen content videos, which will help researchers better understand how to explore new algorithms. Besides collecting and post-processing the data to organize the dataset, we provide a benchmark containing the performance of both traditional codecs and learning-based methods.