Sparse Point-wise Privacy Leakage: Mechanism Design and Fundamental Limits

We study an information-theoretic privacy mechanism design problem in which an agent observes useful data $Y$ that is arbitrarily correlated with sensitive data $X$ and designs disclosed data $U$ generated from $Y$ (the agent has no direct access to $X$). We introduce \emph{sparse point-wise privacy leakage}, a worst-case privacy criterion that enforces two simultaneous constraints for every disclosed symbol $u\in\mathcal{U}$: (i) $u$ may be correlated with at most $N$ realizations of $X$, and (ii) the total leakage toward those realizations is bounded. In the high-privacy regime, we use concepts from information geometry to obtain a local quadratic approximation of the mutual information between $U$ and $Y$, which measures utility. When the leakage matrix $P_{X|Y}$ is invertible, this approximation reduces the design problem to a sparse quadratic maximization with an $\ell_0$ constraint, that is, a sparse Rayleigh-quotient problem. We further show that, for the approximated problem, one can restrict attention to a binary released variable $U$ with a uniform distribution without loss of optimality. For small alphabet sizes, the exact sparsity-constrained optimum can be computed by combinatorial support enumeration, but this quickly becomes intractable as the dimension grows. For general dimensions, the resulting sparse Rayleigh-quotient maximization is NP-hard and closely related to sparse principal component analysis (PCA). We propose a convex semidefinite programming (SDP) relaxation that is solvable in polynomial time and provides a tractable surrogate for the NP-hard design, together with a simple rounding procedure that recovers a feasible leakage direction. We also identify a sparsity threshold beyond which the sparse optimum saturates at the unconstrained spectral value and the SDP relaxation becomes tight.
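
The support-enumeration step and the saturation effect mentioned above can be illustrated with a minimal sketch. It assumes the approximated design problem takes the generic sparse Rayleigh-quotient form $\max_x x^\top A x$ subject to $\|x\|_2 = 1$ and $\|x\|_0 \le N$ for a symmetric positive semidefinite matrix $A$. The matrix `A` below, the function names, and the use of power iteration in place of a library eigensolver are all illustrative assumptions, not the construction used in this work.

```python
from itertools import combinations
from math import sqrt


def top_eigenvalue(M, iters=500):
    """Approximate the largest eigenvalue of a symmetric PSD matrix by power iteration."""
    n = len(M)
    # Deterministic, non-uniform start vector (assumed not orthogonal to the top eigenvector).
    v = [1.0 + 0.1 * i for i in range(n)]
    for _ in range(iters):
        w = [sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = sqrt(sum(x * x for x in w))
        if norm == 0.0:  # M v = 0, so the quadratic form along v is zero
            return 0.0
        v = [x / norm for x in w]
    Mv = [sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]
    # Rayleigh quotient of the (unit-norm) iterate.
    return sum(v[i] * Mv[i] for i in range(n))


def sparse_rayleigh_max(A, N):
    """Maximize x^T A x over unit vectors x with at most N nonzero entries."""
    n = len(A)
    k = min(N, n)
    best_value, best_support = float("-inf"), None
    # Enlarging a support never lowers the optimum, so supports of size k suffice.
    for support in combinations(range(n), k):
        sub = [[A[i][j] for j in support] for i in support]
        value = top_eigenvalue(sub)
        if value > best_value:
            best_value, best_support = value, support
    return best_value, best_support


# Hypothetical 4x4 PSD matrix whose top eigenvector is supported on two coordinates.
A = [
    [2.0, 1.0, 0.0, 0.0],
    [1.0, 2.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 0.5],
]

if __name__ == "__main__":
    for N in range(1, 5):
        value, support = sparse_rayleigh_max(A, N)
        print(N, round(value, 6), support)
```

In this toy instance the top eigenvector of `A` has only two nonzero entries, so the sparse optimum stops increasing once $N \ge 2$ and equals the unconstrained largest eigenvalue. The enumeration visits $\binom{n}{N}$ supports, which is why it is practical only for small alphabets.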
