Optimistic online learning algorithms have been developed to exploit expert advice, which is optimistically assumed to be always useful. However, it is legitimate to question the relevance of such advice \emph{w.r.t.} the learning information provided by gradient-based online algorithms. In this work, we challenge the confidence assumption on the expert and develop the \emph{optimistically tempered} (OT) online learning framework, as well as OT adaptations of online algorithms. Our algorithms come with sound theoretical guarantees in the form of dynamic regret bounds, and we finally provide experimental validation of the usefulness of the OT approach.