We derive general bounds on the probability that the empirical first-passage time $\overline{\tau}_n\equiv \sum_{i=1}^n\tau_i/n$ of a reversible ergodic Markov process, inferred from a sample of $n$ independent realizations, deviates from the true mean first-passage time by more than any given amount in either direction. We construct non-asymptotic confidence intervals that hold in the elusive small-sample regime and thus fill the gap between asymptotic methods and the Bayesian approach, which is known to be sensitive to prior belief and tends to underestimate uncertainty in the small-sample setting. We prove sharp bounds on extreme first-passage times that control uncertainty even in cases where the mean alone does not sufficiently characterize the statistics. Our concentration-of-measure-based results allow for model-free error control and reliable error estimation in kinetic inference, and are thus important for the analysis of experimental and simulation data in the presence of limited sampling.
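To make the estimator $\overline{\tau}_n$ concrete, the following minimal sketch samples first-passage times of a toy reversible ergodic Markov jump process and averages them. Everything specific in it is an assumption chosen for illustration and is not taken from the text: the nearest-neighbour chain on states $0,\dots,N$ with unit hopping rates, the start at $0$ and target $N$, and the hypothetical function names `sample_first_passage_time`, `empirical_mean_fpt`, and `exact_mean_fpt`. The sketch does not implement the bounds or confidence intervals derived here; it only shows how strongly $\overline{\tau}_n$ can scatter around the true mean when $n$ is small.

```python
import random


def sample_first_passage_time(n_states, rng):
    """One first-passage time from state 0 to state n_states.

    Toy model (an assumption for illustration): a continuous-time
    nearest-neighbour jump process on 0..n_states with unit hopping rates,
    simulated with the Gillespie algorithm.
    """
    state, t = 0, 0.0
    while state != n_states:
        # State 0 can only hop right (total rate 1); interior states hop
        # left or right with rate 1 each (total rate 2).
        total_rate = 1.0 if state == 0 else 2.0
        t += rng.expovariate(total_rate)  # exponential waiting time
        if state == 0 or rng.random() < 0.5:
            state += 1
        else:
            state -= 1
    return t


def empirical_mean_fpt(n, n_states, rng):
    """Empirical first-passage time: the average of n independent samples."""
    return sum(sample_first_passage_time(n_states, rng) for _ in range(n)) / n


def exact_mean_fpt(n_states):
    """Exact mean first-passage time 0 -> n_states for this toy chain.

    The mean time to go from i to i+1 is i + 1, so the total is N(N+1)/2.
    """
    return n_states * (n_states + 1) / 2


if __name__ == "__main__":
    rng = random.Random(0)
    n_states = 3
    print("exact mean:", exact_mean_fpt(n_states))
    # Repeat the small-sample experiment a few times to see the scatter.
    for n in (5, 50, 5000):
        estimates = [empirical_mean_fpt(n, n_states, rng) for _ in range(5)]
        print(f"n = {n}:", [round(e, 2) for e in estimates])
```

Run as a script, this prints five independent estimates for each sample size next to the exact value for the toy chain, so the spread at small $n$ can be compared with the spread at large $n$.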