We present a family of novel block-sample MAC-Bayes (mean approximately correct) bounds. While PAC-Bayes (probably approximately correct) bounds typically bound the generalization error with high probability, MAC-Bayes bounds have a similar form but bound the expected generalization error instead. The proposed family can be understood as a generalization of an expectation version of known PAC-Bayes bounds. Compared to standard PAC-Bayes bounds, the new bounds contain divergence terms that depend only on subsets (or \emph{blocks}) of the training data. The proposed MAC-Bayes bounds hold the promise of being significantly tighter than traditional PAC-Bayes and MAC-Bayes bounds. We illustrate this with a simple numerical example in which the original PAC-Bayes bound is vacuous regardless of the choice of prior, while the proposed family of bounds is finite for appropriate choices of the block size. We also explore the question of whether high-probability versions of our MAC-Bayes bounds (i.e., PAC-Bayes bounds of a similar form) are possible. We answer this question in the negative with an example showing that, in general, it is not possible to establish a PAC-Bayes bound that (a) vanishes at a rate faster than $\mathcal{O}(1/\log n)$ whenever the proposed MAC-Bayes bound vanishes at rate $\mathcal{O}(n^{-1/2})$ and (b) exhibits a logarithmic dependence on the permitted error probability.