We study the problem of guaranteeing Differential Privacy (DP) in hyper-parameter tuning, a crucial process in machine learning that involves selecting the best of several runs. Unlike many private algorithms, including the prevalent DP-SGD, the privacy implications of tuning remain insufficiently understood. Recent works propose a generic private solution for the tuning process, yet a fundamental question persists: is the current privacy bound for this solution tight? This paper contributes both positive and negative answers to this question. First, we present studies affirming that the current privacy analysis is indeed tight in a general sense. However, when we specifically study the hyper-parameter tuning problem, such tightness no longer holds. We first demonstrate this by applying a privacy audit to the tuning process. Our findings reveal a substantial gap between the current theoretical privacy bound and the empirical bound derived even under the strongest audit setup. This gap is not a fluke: our subsequent study provides an improved privacy result for private hyper-parameter tuning that stems from its distinct properties. Our privacy results are also more generalizable than prior analyses, which are easily applicable only in specific setups.