We study a private compression design problem in which an encoder observes useful data $Y$, wishes to compress it using a variable-length code, and communicates the result over an unsecured channel. Since $Y$ is correlated with private data $X$, the encoder uses a private compression mechanism to design the encoded message $\cal C$ and sends it over the channel. An adversary is assumed to have access to the encoder output $\cal C$ and tries to estimate $X$. Furthermore, the encoder and decoder are assumed to share a secret key $W$. In this work, we generalize the perfect privacy (secrecy) assumption and allow non-zero leakage between the private data $X$ and the encoded message $\cal C$. The design goal is to construct an encoded message $\cal C$ with the minimum possible average length that satisfies the non-perfect privacy constraints. We derive upper and lower bounds on the average length of the encoded message under different privacy metrics and study them in special cases. For achievability, we use a two-part construction coding scheme and extended versions of the Functional Representation Lemma. Lastly, we show through an example that the bounds can be asymptotically tight.