In recent years, privacy-preserving machine learning algorithms have attracted increasing attention because of their important applications across many scientific fields. Most privacy-preserving algorithms in the literature, however, require the learning objective to be strongly convex and Lipschitz smooth, and therefore cannot handle a wide class of robust loss functions (e.g., the quantile or least absolute loss). In this work, we develop a fast privacy-preserving learning solution for a sparse robust regression problem. Our learning loss consists of a robust least absolute loss and an $\ell_1$ sparse penalty term. To solve this non-smooth loss quickly under a given privacy budget, we develop a Fast Robust And Privacy-Preserving Estimation (FRAPPE) algorithm for least absolute deviation (LAD) regression. Our algorithm achieves fast estimation by reformulating the sparse LAD problem as a penalized least squares problem, and it adopts a three-stage noise injection to guarantee $(\epsilon,\delta)$-differential privacy. We show that our algorithm achieves a better trade-off between privacy and statistical accuracy than state-of-the-art privacy-preserving regression algorithms. Finally, we conduct experiments to verify the efficiency of the proposed FRAPPE algorithm.
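To make the two key ideas concrete, the sketch below illustrates (i) turning the non-smooth sparse LAD objective into a sequence of penalized least squares solves, and (ii) releasing the estimate under $(\epsilon,\delta)$-differential privacy via Gaussian noise. This is a minimal, generic illustration: the IRLS surrogate and the single output-perturbation step are stand-ins, not the paper's actual reformulation or its three-stage noise injection, and the `sensitivity` bound passed to `privatize` is assumed to be given (deriving such a bound is part of what the paper addresses).

```python
import numpy as np

def lad_lasso_irls(X, y, lam, n_iter=100, tol=1e-8):
    """Sparse LAD via iteratively reweighted least squares (IRLS).

    Each iteration majorizes |r_i| and |beta_j| by quadratic surrogates,
    so every step reduces to a penalized least squares solve. This is a
    generic smoothing device, not the paper's specific reformulation.
    """
    n, p = X.shape
    # Initialize from OLS so the adaptive-ridge weights are well defined.
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    for _ in range(n_iter):
        r = y - X @ beta
        w = 1.0 / np.maximum(np.abs(r), tol)      # |r_i| ~ w_i * r_i^2
        d = lam / np.maximum(np.abs(beta), tol)   # |b_j| ~ d_j * b_j^2
        A = X.T @ (w[:, None] * X) + np.diag(d)   # weighted LS + adaptive ridge
        beta_new = np.linalg.solve(A, X.T @ (w * y))
        if np.max(np.abs(beta_new - beta)) < 1e-6:
            beta = beta_new
            break
        beta = beta_new
    return beta

def privatize(beta, sensitivity, eps, delta, rng):
    """Release beta under (eps, delta)-DP via the Gaussian mechanism.

    `sensitivity` is an assumed bound on the l2-sensitivity of the
    non-private estimator; sigma follows the standard calibration
    sigma = sensitivity * sqrt(2 ln(1.25/delta)) / eps.
    """
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / eps
    return beta + rng.normal(0.0, sigma, size=beta.shape)

# Example: sparse signal with heavy-tailed (Laplace) noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
beta_true = np.zeros(10)
beta_true[:3] = [2.0, -1.5, 1.0]
y = X @ beta_true + rng.laplace(scale=0.5, size=200)

beta_hat = lad_lasso_irls(X, y, lam=1.0)
beta_priv = privatize(beta_hat, sensitivity=0.1, eps=1.0, delta=1e-5, rng=rng)
```

In this illustration the robustness comes from the least absolute loss (each IRLS step downweights large residuals), while the $\ell_1$ penalty is approximated by an adaptive ridge term so that every iteration remains a linear solve.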