This paper studies Hoeffding's inequality for Markov chains under a generalized concentrability condition defined via an integral probability metric (IPM). The generalized concentrability condition establishes a framework that interpolates between and extends the existing hypotheses of Hoeffding-type inequalities for Markov chains. The flexibility of our framework allows Hoeffding's inequality to be applied beyond ergodic Markov chains in the traditional sense. We demonstrate its utility by applying the framework to several non-asymptotic analyses arising in machine learning, including (i) a generalization bound for empirical risk minimization with Markovian samples, (ii) a finite-sample guarantee for Polyak-Ruppert averaging of SGD, and (iii) a new regret bound for rested Markovian bandits with general state space.