We study the generalization error of stochastic learning algorithms from an information-theoretic perspective, with a particular emphasis on deriving sharper bounds for differentially private algorithms. It is well known that the generalization error of stochastic learning algorithms can be bounded in terms of mutual information and maximal leakage, yielding in-expectation and high-probability guarantees, respectively. In this work, we further upper bound mutual information and maximal leakage by explicit, easily computable formulas, using typicality-based arguments and exploiting the stability properties of private algorithms. In the first part of the paper, we strictly improve the mutual-information bounds by Rodríguez-Gálvez et al. (IEEE Trans. Inf. Theory, 2021). In the second part, we derive new upper bounds on the maximal leakage of learning algorithms. In both cases, the resulting bounds on information measures translate directly into generalization error guarantees.
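For orientation, the two well-known bounds mentioned above are usually stated in the following form. This is a sketch under assumed notation that does not appear in the abstract: $S$ is a training set of $n$ i.i.d. samples, $W$ is the output of the learning algorithm, $\mathrm{gen}(S,W)$ is the gap between population risk and empirical risk, and the loss is assumed to be $\sigma$-sub-Gaussian. The constants follow the standard statements in the literature and may differ from those used in the paper.

```latex
% Assumed notation (not from the abstract): S = training set of n i.i.d. samples,
% W = algorithm output, gen(S,W) = population risk minus empirical risk,
% loss assumed sigma-sub-Gaussian.

% In-expectation guarantee via mutual information I(S;W):
\[
  \bigl|\mathbb{E}\,[\mathrm{gen}(S,W)]\bigr|
  \;\le\; \sqrt{\frac{2\sigma^{2}}{n}\, I(S;W)} .
\]

% High-probability guarantee via maximal leakage L(S -> W), for any eta > 0:
\[
  \mathbb{P}\bigl(\lvert \mathrm{gen}(S,W)\rvert \ge \eta\bigr)
  \;\le\; 2\exp\!\Bigl(\mathcal{L}(S \to W) - \frac{n\eta^{2}}{2\sigma^{2}}\Bigr),
\]
% where, for discrete S and W,
\[
  \mathcal{L}(S \to W)
  \;=\; \log \sum_{w} \max_{s \,:\, P_S(s) > 0} P_{W \mid S}(w \mid s) .
\]
```

Any explicit upper bound on $I(S;W)$ or $\mathcal{L}(S \to W)$ can be substituted into the right-hand sides, which is the sense in which bounds on the information measures translate directly into generalization error guarantees.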