Path · Sequence · ISC · Representation · Structure

From Path Signatures to Sequential Modeling: Incremental Signature Contributions for Offline RL


Ziyi Zhao,Qingchuan Li,Yuxuan Xu

Path signatures embed trajectories into the tensor algebra and constitute a universal, non-parametric representation of paths. In their standard form, however, they collapse temporal structure into a single global object, which limits their suitability for decision-making problems that require step-wise reactivity. We propose the Incremental Signature Contribution (ISC) method, which decomposes truncated path signatures into a temporally ordered sequence of elements in the tensor-algebra space, each corresponding to the incremental contribution induced by the most recent path increment. This reconstruction preserves the algebraic structure and expressivity of signatures while making their internal temporal evolution explicit, so that signature-based representations can be processed with sequential modeling approaches. In contrast to full signatures, ISC is inherently sensitive to instantaneous trajectory updates, which is critical for control dynamics that are sensitive and require stability. Building on this representation, we introduce the ISC-Transformer (ISCT), an offline reinforcement learning model that integrates ISC into a standard Transformer architecture without further architectural modification. We evaluate ISCT on HalfCheetah, Walker2d, Hopper, and Maze2d, including settings with delayed rewards and downgraded datasets. The results demonstrate that the ISC method provides a theoretically grounded and practically effective alternative to path processing for temporally sensitive control tasks.
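The abstract does not give the exact ISC definition, so the following is only a minimal sketch of the general idea under stated assumptions: the path is piecewise linear, the signature is truncated at depth 2, and the contribution at each step is taken to be the difference between consecutive running signatures, with the running signature updated by Chen's identity. The function names `segment_signature`, `chen_product`, and `incremental_contributions` are hypothetical and are not taken from the paper.

```python
import numpy as np


def segment_signature(delta):
    """Depth-2 signature of one linear segment with increment `delta`."""
    level1 = delta
    level2 = np.outer(delta, delta) / 2.0
    return level1, level2


def chen_product(a, b):
    """Chen's identity truncated at depth 2: signature of path a followed by path b."""
    a1, a2 = a
    b1, b2 = b
    return a1 + b1, a2 + b2 + np.outer(a1, b1)


def incremental_contributions(path):
    """Return the per-step changes of the running depth-2 signature.

    Assumption (not from the paper): the contribution of step t is the
    difference between the running signature after and before step t.
    """
    path = np.asarray(path, dtype=float)
    dim = path.shape[1]
    running = (np.zeros(dim), np.zeros((dim, dim)))
    contributions = []
    for delta in np.diff(path, axis=0):
        updated = chen_product(running, segment_signature(delta))
        contributions.append((updated[0] - running[0], updated[1] - running[1]))
        running = updated
    return contributions, running


# Toy 2-D path: one step along x, then one step along y.
toy_path = [[0.0, 0.0], [1.0, 0.0], [1.0, 1.0]]
steps, full_signature = incremental_contributions(toy_path)
```

In this sketch the per-step terms sum back to the full truncated signature, so no information is lost, while the resulting sequence of length `len(path) - 1` can be fed to a sequence model one token per step.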

