The promise of least-privilege learning -- to find feature representations that are useful for a learning task but prevent inference of any sensitive information unrelated to this task -- is highly appealing. However, so far this concept has only been stated informally. It thus remains an open question whether and how this goal can be achieved. In this work, we provide the first formalisation of the least-privilege principle for machine learning and characterise its feasibility. We prove that there is a fundamental trade-off between a representation's utility for a given task and its leakage beyond the intended task: it is not possible to learn representations that have high utility for the intended task but, at the same time, prevent inference of any attribute other than the task label itself. This trade-off holds under realistic assumptions on the data distribution and regardless of the technique used to learn the feature mappings that produce these representations. We empirically validate this result for a wide range of learning techniques, model architectures, and datasets.
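To make the trade-off concrete, one natural information-theoretic reading (a sketch of the kind of formalisation involved, with mutual information $I(\cdot\,;\cdot)$ standing in for whatever utility and leakage measures the paper adopts) is the following. Let $X$ be the data, $Y$ the task label, $S$ any attribute, and $Z = f(X)$ a learned representation. Least-privilege asks that, for every attribute $S$,
\[
I(Z; S) \;\le\; I(Y; S),
\]
that is, observing $Z$ reveals no more about $S$ than the task label itself would, while utility asks that $I(Z; Y)$ be large enough to support accurate prediction of $Y$. The trade-off stated above says these two requirements cannot be met simultaneously: any $Z$ with non-trivial utility admits attributes $S \neq Y$ for which the inequality fails.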
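The style of experiment behind the empirical validation can be sketched in a few lines. The synthetic data, the naive MLP encoder, and the logistic-regression attack below are illustrative assumptions, not the paper's setup (which also covers feature mappings explicitly trained to suppress off-task information): train a feature map on the task label $Y$ only, freeze it, then train an attacker to recover an off-task attribute $S$ from the frozen representations and compare against guessing $S$ without them.

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import MLPClassifier

    rng = np.random.default_rng(0)

    # Synthetic data: some features carry the task label y, others an
    # off-task attribute s that is statistically independent of y.
    n, d = 5000, 20
    y = rng.integers(0, 2, n)
    s = rng.integers(0, 2, n)
    X = rng.normal(size=(n, d))
    X[:, :5] += 1.5 * y[:, None]    # task-relevant features
    X[:, 5:10] += 1.5 * s[:, None]  # attribute-relevant features

    X_tr, X_te, y_tr, y_te, s_tr, s_te = train_test_split(
        X, y, s, test_size=0.3, random_state=0)

    # "Encoder": the hidden layer of an MLP trained only on the task label.
    mlp = MLPClassifier(hidden_layer_sizes=(8,), max_iter=500, random_state=0)
    mlp.fit(X_tr, y_tr)

    def encode(A):
        # ReLU hidden layer of the fitted MLP (its default activation).
        return np.maximum(0.0, A @ mlp.coefs_[0] + mlp.intercepts_[0])

    Z_tr, Z_te = encode(X_tr), encode(X_te)

    # Utility: how well the representation supports the intended task.
    task = LogisticRegression(max_iter=1000).fit(Z_tr, y_tr)
    print("task accuracy from Z:     ", task.score(Z_te, y_te))

    # Leakage: an attacker infers the off-task attribute from the same Z.
    attack = LogisticRegression(max_iter=1000).fit(Z_tr, s_tr)
    baseline = max(s_te.mean(), 1 - s_te.mean())  # guess s without Z
    print("attribute accuracy from Z:", attack.score(Z_te, s_te))
    print("majority-class baseline:  ", baseline)

In this toy setting the attacker's accuracy on $S$ typically exceeds the majority-class baseline even though the encoder was never shown $S$, which is precisely the kind of off-task leakage the utility/leakage trade-off concerns.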