Towards Scalable Neural Representation for Diverse Videos

Video · Representation · Redundancy · Dataset · Joint encoding

March 24, 2023

Bo He, Xitong Yang, Hanyu Wang, Zuxuan Wu, Hao Chen, Shuaiyi Huang, Yixuan Ren, Ser-Nam Lim, Abhinav Shrivastava

From arXiv; accepted at CVPR 2023

Implicit neural representations (INRs) have gained increasing attention in representing 3D scenes and images, and have recently been applied to encode videos (e.g., NeRV, E-NeRV). While achieving promising results, existing INR-based methods are limited to encoding a handful of short videos (e.g., seven 5-second videos in the UVG dataset) with redundant visual content, leading to a model design that fits individual video frames independently and does not scale efficiently to a large number of diverse videos. This paper focuses on developing neural representations for a more practical setup: encoding long videos and/or a large number of videos with diverse visual content. We first show that instead of dividing videos into small subsets and encoding them with separate models, encoding long and diverse videos jointly with a unified model achieves better compression results. Based on this observation, we propose D-NeRV, a novel neural representation framework designed to encode diverse videos by (i) decoupling clip-specific visual content from motion information, (ii) introducing temporal reasoning into the implicit neural network, and (iii) employing task-oriented flow as an intermediate output to reduce spatial redundancies. Our new model largely surpasses NeRV and traditional video compression techniques on the video compression task on the UCF101 and UVG datasets. Moreover, when used as an efficient data loader, D-NeRV achieves 3%-10% higher accuracy than NeRV on action recognition tasks on the UCF101 dataset under the same compression ratios.
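
As a rough illustration of point (iii), the sketch below shows what using a flow field as an intermediate output can mean in general: rather than regenerating every pixel, a model predicts per-pixel offsets, and the frame is assembled by copying content from a key frame. This is a generic backward-warping example and not D-NeRV's actual implementation. The abstract does not specify the warping scheme, so the function name `warp_frame`, the integer offsets, the nearest-neighbor sampling, and the border clamping are all assumptions made for illustration.

```python
def warp_frame(key_frame, flow):
    """Backward-warp a grayscale key frame with a per-pixel flow field.

    key_frame: list of rows, each a list of pixel values.
    flow: same shape as key_frame; each entry is (dy, dx), the integer
          offset from an output pixel to the source pixel it copies.
    Both the data layout and the integer offsets are illustrative
    assumptions, not details taken from the paper.
    """
    height = len(key_frame)
    width = len(key_frame[0])
    warped = []
    for y in range(height):
        row = []
        for x in range(width):
            dy, dx = flow[y][x]
            # Clamp source coordinates so every sample stays inside the frame.
            src_y = min(max(y + dy, 0), height - 1)
            src_x = min(max(x + dx, 0), width - 1)
            row.append(key_frame[src_y][src_x])
        warped.append(row)
    return warped


key_frame = [
    [1, 2, 3],
    [4, 5, 6],
]
# Every output pixel copies from its left neighbour, so content shifts right.
shift_right = [[(0, -1)] * 3 for _ in range(2)]
warped = warp_frame(key_frame, shift_right)

# A zero flow field leaves the key frame unchanged.
zero_flow = [[(0, 0)] * 3 for _ in range(2)]
unchanged = warp_frame(key_frame, zero_flow)
```

In this toy case the flow field encodes only motion, while all appearance information comes from the key frame. This loosely mirrors the separation of visual content from motion described in the abstract.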
