Immersive Video Compression using Implicit Neural Representations

Recent work on implicit neural representations (INRs) has evidenced their potential for efficiently representing and encoding conventional video content. In this paper we, for the first time, extend their application to immersive (multi-view) videos, by proposing MV-HiNeRV, a new INR-based immersive video codec. MV-HiNeRV is an enhanced version of a state-of-the-art INR-based video codec, HiNeRV, which was developed for single-view video compression. We have modified the model to learn a different group of feature grids for each view, and share the learnt network parameters among all views. This enables the model to effectively exploit the spatio-temporal and the inter-view redundancy that exists within multi-view videos. The proposed codec was used to compress multi-view texture and depth video sequences in the MPEG Immersive Video (MIV) Common Test Conditions, and tested against the MIV Test model (TMIV) that uses the VVenC video codec. The results demonstrate the superior performance of MV-HiNeRV, with significant coding gains (up to 72.33%) over TMIV. The implementation of MV-HiNeRV will be published for further development and evaluation.

翻译：近期关于隐式神经表示（INR）的研究已证明其在高效表示和编码传统视频内容方面的潜力。本文首次将其应用拓展至沉浸式（多视角）视频，提出了一种基于INR的新型沉浸式视频编解码器MV-HiNeRV。MV-HiNeRV是当前最先进的基于INR的单视角视频编解码器HiNeRV的增强版本。我们对模型进行了改造，使其为每个视角学习不同的特征网格组，并在所有视角间共享已学习的网络参数。这一设计使模型能够有效利用多视角视频中存在的时空冗余与视角间冗余。该编解码器被用于压缩MPEG沉浸式视频（MIV）通用测试条件下的多视角纹理与深度视频序列，并与采用VVenC视频编解码器的MIV测试模型（TMIV）进行了对比测试。结果表明，MV-HiNeRV展现出卓越性能，相较于TMIV实现了显著编码增益（最高达72.33%）。MV-HiNeRV的实现代码将公开发布，以支持后续开发与评估。

相关内容

MoDELS

关注 45

ACM/IEEE第23届模型驱动工程语言和系统国际会议，是模型驱动软件和系统工程的首要会议系列，由ACM-SIGSOFT和IEEE-TCSE支持组织。自1998年以来，模型涵盖了建模的各个方面，从语言和方法到工具和应用程序。模特的参加者来自不同的背景，包括研究人员、学者、工程师和工业专业人士。MODELS 2019是一个论坛，参与者可以围绕建模和模型驱动的软件和系统交流前沿研究成果和创新实践经验。今年的版本将为建模社区提供进一步推进建模基础的机会，并在网络物理系统、嵌入式系统、社会技术系统、云计算、大数据、机器学习、安全、开源等新兴领域提出建模的创新应用以及可持续性。官网链接：http://www.modelsconference.org/

O’Reilly报告：知识图谱崛起——面向现代数据集成和数据结构体系，“The Rise of the Knowledge Graph——Toward Modern Data Integration and the Data Fabric Architecture”

专知会员服务

49+阅读 · 2022年2月18日

【NeurIPS2021】用于文本图表示学习的 GNN 嵌套 Transformer 模型：GraphFormers

专知会员服务

46+阅读 · 2021年11月24日

FlowQA: Grasping Flow in History for Conversational Machine Comprehension

专知会员服务

34+阅读 · 2019年10月18日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日