Online prediction from experts is a fundamental problem in machine learning, and several works have studied this problem under privacy constraints. We propose and analyze new algorithms for this problem that improve over the regret bounds of the best existing algorithms for non-adaptive adversaries. For approximate differential privacy (DP), our algorithms achieve regret bounds of $\tilde{O}(\sqrt{T \log d} + \log d/\varepsilon)$ for the stochastic setting and $\tilde{O}(\sqrt{T \log d} + T^{1/3} \log d/\varepsilon)$ for oblivious adversaries (where $d$ is the number of experts). For pure DP, our algorithms are the first to obtain sub-linear regret for oblivious adversaries in the high-dimensional regime $d \ge T$. Moreover, we prove new lower bounds for adaptive adversaries. Our results imply that, unlike in the non-private setting, there is a strong separation between the optimal regret for adaptive and non-adaptive adversaries for this problem. Our lower bounds also show a separation between pure and approximate differential privacy for adaptive adversaries, where the latter is necessary to achieve the non-private $O(\sqrt{T})$ regret.
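To make the two approximate-DP bounds concrete, the following minimal sketch evaluates their two terms and solves for the privacy level at which the privacy term stops dominating. It assumes all constants and the logarithmic factors hidden by $\tilde{O}$ are equal to 1, so the numbers are only indicative; the function names `regret_terms` and `eps_threshold` are hypothetical and not from the paper.

```python
import math


def regret_terms(T, d, eps, setting="oblivious"):
    """Return (non_private_term, privacy_term) of the stated bounds.

    Assumption: constants and the log factors hidden by O-tilde are set to 1.
    """
    non_private = math.sqrt(T * math.log(d))
    if setting == "stochastic":
        privacy = math.log(d) / eps
    elif setting == "oblivious":
        privacy = T ** (1 / 3) * math.log(d) / eps
    else:
        raise ValueError(f"unknown setting: {setting!r}")
    return non_private, privacy


def eps_threshold(T, d, setting="oblivious"):
    """Smallest eps for which the privacy term is at most sqrt(T log d).

    Obtained by setting the two terms equal and solving for eps.
    """
    if setting == "stochastic":
        # log d / eps <= sqrt(T log d)  <=>  eps >= sqrt(log d / T)
        return math.sqrt(math.log(d) / T)
    if setting == "oblivious":
        # T^(1/3) log d / eps <= sqrt(T log d)  <=>  eps >= sqrt(log d) / T^(1/6)
        return math.sqrt(math.log(d)) / T ** (1 / 6)
    raise ValueError(f"unknown setting: {setting!r}")


if __name__ == "__main__":
    T, d = 10**6, 100
    for setting in ("stochastic", "oblivious"):
        eps = eps_threshold(T, d, setting)
        print(setting, "threshold eps:", eps, "terms:", regret_terms(T, d, eps, setting))
```

Under these simplifications, the privacy term is no larger than the non-private $\sqrt{T \log d}$ term once $\varepsilon \ge \sqrt{\log d / T}$ in the stochastic setting and once $\varepsilon \ge \sqrt{\log d}/T^{1/6}$ for oblivious adversaries.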