Delayed Feedback in Generalised Linear Bandits Revisited

6 April 2023

Benjamin Howson, Ciara Pike-Burke, Sarah Filippi

The stochastic generalised linear bandit is a well-understood model for sequential decision-making problems, with many algorithms achieving near-optimal regret guarantees under immediate feedback. However, the stringent requirement for immediate rewards is unmet in many real-world applications, where the reward is almost always delayed. We study delayed rewards in generalised linear bandits from a theoretical perspective. We show that a natural adaptation of an optimistic algorithm to delayed feedback achieves a regret bound in which the penalty for the delays is independent of the horizon. This result significantly improves upon existing work, where the best known regret bound has a delay penalty that increases with the horizon. We verify our theoretical results through experiments on simulated data.
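To make the setting concrete, the following is a minimal simulation sketch of an optimistic index policy under delayed feedback. It is not the authors' algorithm or their confidence widths: it assumes a logistic link, a fixed finite arm set, geometrically distributed delays, and a hand-picked exploration weight `alpha`, all of which are illustrative choices rather than details from the abstract. The one idea it shares with the abstract is that the estimator and the exploration bonus are built only from rewards that have already arrived.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def fit_logistic(X, y, lam=1.0, iters=25):
    """Ridge-regularised logistic MLE via plain Newton steps (illustrative)."""
    d = X.shape[1]
    theta = np.zeros(d)
    for _ in range(iters):
        p = sigmoid(X @ theta)
        grad = X.T @ (p - y) + lam * theta
        hess = (X * (p * (1.0 - p))[:, None]).T @ X + lam * np.eye(d)
        theta -= np.linalg.solve(hess, grad)
    return theta


def run_delayed_glm_ucb(T=500, d=3, K=10, mean_delay=20.0, alpha=1.0, seed=0):
    """Simulate a UCB-style logistic bandit whose rewards arrive late.

    Returns (cumulative expected regret, rewards observed, rewards pending).
    All parameter names and defaults are assumptions for this sketch.
    """
    rng = np.random.default_rng(seed)
    arms = rng.normal(size=(K, d))
    arms /= np.linalg.norm(arms, axis=1, keepdims=True)
    theta_star = rng.normal(size=d)
    theta_star /= np.linalg.norm(theta_star)
    means = sigmoid(arms @ theta_star)

    pending = []            # (arrival_round, arm_index, reward)
    X_obs, y_obs = [], []   # only rewards that have actually arrived
    V = np.eye(d)           # regularised design matrix over observed rounds
    theta_hat = np.zeros(d)
    regret = 0.0

    for t in range(T):
        # Collect every reward whose delay has elapsed by round t.
        arrived = [p for p in pending if p[0] <= t]
        pending = [p for p in pending if p[0] > t]
        for _, a, r in arrived:
            X_obs.append(arms[a])
            y_obs.append(r)
            V += np.outer(arms[a], arms[a])
        if arrived:
            theta_hat = fit_logistic(np.array(X_obs), np.array(y_obs))

        # Optimistic index: predicted mean plus a bonus in the V^{-1} norm.
        V_inv = np.linalg.inv(V)
        bonus = np.sqrt(np.einsum("kd,de,ke->k", arms, V_inv, arms))
        a = int(np.argmax(sigmoid(arms @ theta_hat) + alpha * bonus))

        # The reward is generated now but revealed only after a random delay.
        r = float(rng.random() < means[a])
        delay = int(rng.geometric(1.0 / mean_delay)) - 1  # delay >= 0
        pending.append((t + 1 + delay, a, r))
        regret += means.max() - means[a]

    return regret, len(y_obs), len(pending)
```

Calling `run_delayed_glm_ucb(mean_delay=1.0)` gives the immediate-feedback baseline in this sketch, so comparing its regret with larger values of `mean_delay` gives a rough empirical feel for the delay penalty the abstract refers to. The sketch says nothing about the form of the paper's bound.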
