We study several problems in differentially private domain discovery, where each user holds a subset of items from a shared but unknown domain, and the goal is to output an informative subset of items. For set union, we show that a simple baseline, the Weighted Gaussian Mechanism (WGM), has a near-optimal $\ell_1$ missing mass guarantee on Zipfian data as well as a distribution-free $\ell_\infty$ missing mass guarantee. We then apply the WGM as a domain-discovery precursor to existing known-domain algorithms for private top-$k$ and $k$-hitting set, and obtain new utility guarantees for their unknown-domain variants. Finally, experiments demonstrate that our WGM-based methods are competitive with or outperform existing baselines on all three problems.
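The abstract does not spell out the mechanism, so the following is only a minimal sketch of how a weighted Gaussian approach to private set union is commonly formulated: each user spreads a unit $\ell_2$ budget evenly over their (capped) item set, Gaussian noise is added to each item's total weight, and items whose noisy weight exceeds a threshold are released. The function name `weighted_gaussian_sketch` and the parameters `max_items`, `sigma`, and `threshold` are hypothetical; in particular, `sigma` and `threshold` are taken as inputs here and are not calibrated to any privacy budget $(\varepsilon, \delta)$.

```python
import math
import random
from collections import defaultdict


def weighted_gaussian_sketch(user_sets, max_items, sigma, threshold, rng=None):
    """Illustrative weighted-Gaussian set union (assumed formulation).

    user_sets: iterable of per-user item collections.
    max_items: cap on how many items one user may contribute (assumption).
    sigma:     Gaussian noise scale, supplied by the caller (not calibrated here).
    threshold: release threshold on the noisy weight (not calibrated here).
    """
    rng = rng or random.Random()
    weights = defaultdict(float)

    for items in user_sets:
        # Deduplicate; sort so that subsampling is reproducible under a seeded rng.
        items = sorted(set(items))
        if len(items) > max_items:
            items = rng.sample(items, max_items)
        if not items:
            continue
        # Each user contributes a weight vector with l2 norm exactly 1.
        w = 1.0 / math.sqrt(len(items))
        for item in items:
            weights[item] += w

    # Only items held by at least one user are ever considered for release.
    return {
        item
        for item, total in weights.items()
        if total + rng.gauss(0.0, sigma) > threshold
    }


if __name__ == "__main__":
    users = [["apple", "pear"], ["apple"], ["apple", "fig"]]
    released = weighted_gaussian_sketch(
        users, max_items=2, sigma=1.0, threshold=2.0, rng=random.Random(0)
    )
    print(released)
```

Because the output is always a subset of the items users actually hold, a known-domain algorithm (for example, for top-$k$) could then be run over the released set, which is the "precursor" role the abstract describes.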