We revisit the Bayesian Context Trees (BCT) modelling framework for discrete time series, which was recently found to be very effective in numerous tasks including model selection, estimation and prediction. A novel representation of the induced posterior distribution on model space is derived in terms of a simple branching process, and several consequences of this are explored in theory and in practice. First, it is shown that the branching process representation leads to a simple variable-dimensional Monte Carlo sampler for the joint posterior distribution on models and parameters, which can efficiently produce independent samples. This sampler is found to be more efficient than earlier MCMC samplers for the same tasks. Then, the branching process representation is used to establish the asymptotic consistency of the BCT posterior, including the derivation of an almost-sure convergence rate. Finally, an extensive study is carried out on the performance of the induced Bayesian entropy estimator. Its utility is illustrated through both simulation experiments and real-world applications, where it is found to outperform several state-of-the-art methods.
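As a rough illustration of what a branching-process description of a distribution over context trees can look like, the following minimal sketch grows a proper m-ary tree by letting each node independently stop or branch. It makes several assumptions that are not in the abstract: every node above the maximum depth becomes a leaf with a constant probability `beta` and otherwise branches into all `m` children, and the names `sample_context_tree`, `m`, `max_depth` and `beta` are hypothetical. The posterior representation described above would presumably use data-dependent branching probabilities instead of a constant, which this sketch does not attempt to compute.

```python
import random


def sample_context_tree(m, max_depth, beta, rng, context=()):
    """Grow a random proper m-ary tree and return its leaves as tuples of symbols.

    Assumption (for illustration only): each node at depth < max_depth is a
    leaf with constant probability beta, independently of all other nodes.
    """
    # A node at the maximum depth is always a leaf.
    if len(context) == max_depth:
        return [context]
    # Above the maximum depth, stop with probability beta ...
    if rng.random() < beta:
        return [context]
    # ... otherwise branch into all m children and recurse on each.
    leaves = []
    for symbol in range(m):
        leaves.extend(
            sample_context_tree(m, max_depth, beta, rng, context + (symbol,))
        )
    return leaves


# Independent draws: each call builds a fresh tree, so no Markov chain
# burn-in or thinning is involved.
rng = random.Random(0)
trees = [sample_context_tree(m=2, max_depth=3, beta=0.5, rng=rng) for _ in range(5)]
for leaves in trees:
    print(len(leaves), leaves)
```

Because every draw is generated from scratch, the samples are independent by construction, which is the property the abstract contrasts with MCMC samplers.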