Extracting Implicit Neural Representations (INRs) from video data poses unique challenges due to the additional temporal dimension. For videos, INRs have predominantly relied on a frame-only parameterization, which sacrifices the spatiotemporal continuity observed in pixel-level (spatial) representations. To mitigate this, we introduce Polynomial Neural Representation for Videos (PNeRV), a parameter-efficient, patch-wise INR for videos that preserves spatiotemporal continuity. PNeRV leverages the modeling capabilities of Polynomial Neural Networks to modulate a continuous spatial (patch) signal with a continuous time (frame) signal. We further propose a custom Hierarchical Patch-wise Spatial Sampling Scheme that ensures spatial continuity while retaining parameter efficiency, together with a carefully designed Positional Embedding methodology that further enhances PNeRV's performance. Extensive experiments demonstrate that PNeRV outperforms baselines both on conventional INR tasks such as compression and on downstream applications that require spatiotemporal continuity in the underlying representation. PNeRV not only addresses the challenges posed by video data in the realm of INRs but also opens new avenues for advanced video processing and analysis.
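The core operation named above, multiplicatively modulating a spatial (patch) signal with a temporal (frame) signal via a polynomial interaction, can be sketched as follows. This is a minimal illustration of a second-degree polynomial-network layer, not the paper's actual architecture; all dimensions and weight names (`W_s`, `W_t`, `W_o`) are hypothetical stand-ins for learned parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: spatial (patch) features, temporal (frame) embedding,
# shared hidden width, and output width.
d_s, d_t, d_h, d_o = 8, 4, 16, 8

# Random stand-ins for learned projection weights.
W_s = rng.normal(size=(d_h, d_s))   # projects the spatial signal
W_t = rng.normal(size=(d_h, d_t))   # projects the temporal signal
W_o = rng.normal(size=(d_o, d_h))   # maps the modulated features to the output

def polynomial_modulation(x_spatial, z_time):
    """Second-degree polynomial interaction: the temporal embedding
    multiplicatively gates the spatial features (Hadamard product),
    plus a first-degree skip term on the spatial branch."""
    s = W_s @ x_spatial
    t = W_t @ z_time
    return W_o @ (s * t + s)  # degree-2 cross term + degree-1 term

x = rng.normal(size=d_s)  # continuous spatial (patch) input
z = rng.normal(size=d_t)  # continuous time (frame) input
y = polynomial_modulation(x, z)
print(y.shape)  # (8,)
```

Because the interaction is multiplicative rather than concatenative, the output remains a continuous function of both the spatial and temporal inputs, which is the property the frame-only parameterization gives up.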