We consider offline imitation learning from observations (LfO) in the regime where expert demonstrations are scarce and the available offline suboptimal data lie far from the expert behavior. Many existing distribution-matching approaches struggle in this regime because they impose strict support constraints and rely on brittle one-step models, making it hard to extract a useful learning signal from imperfect data. To tackle this challenge, we propose TGE, a trajectory-level generative embedding for offline LfO that constructs a dense, smooth surrogate reward by estimating the expert state density in the latent space of a temporal diffusion model trained on offline trajectory data. By leveraging the smooth geometry of the learned diffusion embedding, TGE captures long-horizon temporal dynamics and effectively bridges the gap between disjoint supports, ensuring a robust learning signal even when the offline data are distributionally distinct from the expert. Empirically, the proposed approach consistently matches or outperforms prior offline LfO methods across a range of D4RL locomotion and manipulation benchmarks.
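To make the surrogate-reward idea concrete, below is a minimal illustrative sketch (not the paper's implementation): a frozen temporal encoder maps windows of consecutive states to latents, a simple density estimator is fit on expert latents, and the log-density of an offline state window under that estimator serves as a dense reward. The encoder here is a fixed random projection standing in for the learned diffusion embedding; the window length, latent dimension, and KDE choice are all assumptions for illustration.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)

# Stand-in for the learned temporal diffusion embedding: a fixed random
# projection of a flattened state window. In TGE this role is played by the
# frozen encoder of the diffusion model trained on offline trajectories.
H, STATE_DIM, LATENT_DIM = 4, 11, 8
PROJ = rng.normal(size=(H * STATE_DIM, LATENT_DIM)) / np.sqrt(H * STATE_DIM)

def encode(window: np.ndarray) -> np.ndarray:
    """Map an (H, STATE_DIM) window of states to a latent vector."""
    return window.reshape(-1) @ PROJ

def fit_expert_density(expert_states: np.ndarray) -> gaussian_kde:
    """Fit a KDE over expert latents (a simple stand-in density estimator)."""
    windows = [expert_states[t:t + H] for t in range(len(expert_states) - H + 1)]
    latents = np.stack([encode(w) for w in windows])   # (N, LATENT_DIM)
    return gaussian_kde(latents.T)                     # scipy expects (dim, N)

def surrogate_reward(kde: gaussian_kde, window: np.ndarray) -> float:
    """Dense surrogate reward: log expert density at the current latent."""
    return float(kde.logpdf(encode(window)[:, None]))

# Usage: relabel offline state windows with the dense surrogate reward.
expert_states = rng.normal(size=(200, STATE_DIM))      # placeholder expert data
offline_states = rng.normal(size=(50, STATE_DIM))      # placeholder offline data
kde = fit_expert_density(expert_states)
rewards = [surrogate_reward(kde, offline_states[t:t + H])
           for t in range(len(offline_states) - H + 1)]
```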