Certifiable, adaptive uncertainty estimates for unknown quantities are an essential ingredient of sequential decision-making algorithms. Standard approaches rely on problem-dependent concentration results and are limited to a specific combination of parameterization, noise family, and estimator. In this paper, we revisit the likelihood-based inference principle and propose to use likelihood ratios to construct anytime-valid confidence sequences without requiring specialized treatment in each application scenario. Our method is especially suitable for problems with well-specified likelihoods, and the resulting sets always maintain the prescribed coverage in a model-agnostic manner. The size of the sets depends on the choice of the estimator sequence in the likelihood ratio. We discuss how to provably choose the best sequence of estimators and shed light on connections to online convex optimization with algorithms such as Follow-the-Regularized-Leader. To counteract the initially large bias of the estimators, we propose a reweighting scheme that also enables deployment in non-parametric settings such as RKHS function classes. We provide a non-asymptotic analysis of the size of the likelihood ratio confidence sets for generalized linear models, using insights from convex duality and online learning. We showcase the practical strength of our method on generalized linear bandit problems, survival analysis, and bandits with various additive noise distributions.
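
To make the construction concrete, the following is a minimal sketch of a likelihood ratio confidence sequence, not the implementation used in the paper. It assumes i.i.d. Gaussian observations with known standard deviation `sigma`, a scalar parameter restricted to a finite candidate `grid`, and a ridge-regularized running mean as the estimator sequence (a simple Follow-the-Regularized-Leader-style choice). The function names, the grid, and the regularizer `reg` are hypothetical and chosen for illustration only. Because each estimate uses only past observations, the ratio evaluated at the true parameter is a nonnegative martingale with initial value 1, so by Ville's inequality it exceeds 1/alpha at any time with probability at most alpha.

```python
import math
import random


def gaussian_logpdf(y, mean, sigma=1.0):
    """Log-density of N(mean, sigma^2) evaluated at y."""
    return -0.5 * math.log(2.0 * math.pi * sigma**2) - 0.5 * ((y - mean) / sigma) ** 2


def lr_confidence_sequence(ys, grid, alpha=0.05, sigma=1.0, reg=1.0):
    """Return, for each time step, the grid points inside the confidence set.

    The set at time t keeps every candidate theta whose accumulated
    log-likelihood ratio against the running estimators is at most log(1/alpha).
    """
    log_threshold = math.log(1.0 / alpha)
    log_ratio = [0.0] * len(grid)  # one running log-ratio per candidate
    running_sum, n = 0.0, 0
    sets = []
    for y in ys:
        # Assumed estimator: regularized mean of PAST observations only.
        theta_hat = running_sum / (n + reg)
        log_num = gaussian_logpdf(y, theta_hat, sigma)
        for i, theta in enumerate(grid):
            log_ratio[i] += log_num - gaussian_logpdf(y, theta, sigma)
        sets.append([th for i, th in enumerate(grid) if log_ratio[i] <= log_threshold])
        # Update the estimator state only after y has been scored.
        running_sum += y
        n += 1
    return sets


if __name__ == "__main__":
    rng = random.Random(0)
    observations = [rng.gauss(0.5, 1.0) for _ in range(200)]
    candidates = [i / 100.0 for i in range(-200, 301)]
    sequence = lr_confidence_sequence(observations, candidates)
    for t in (10, 50, 200):
        kept = sequence[t - 1]
        if kept:
            print(t, min(kept), max(kept))
        else:
            print(t, "empty on this grid")
```

Since the coverage guarantee holds simultaneously over all time steps, the sets may also be intersected over time without losing validity. How quickly they shrink depends on how well the estimator sequence predicts the next observation, which is where the choice of estimator enters.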