We investigate an attack on a machine learning model that predicts whether a person or household will relocate in the next two years, i.e., a propensity-to-move classifier. The attack assumes that the attacker can query the model to obtain predictions and that the marginal distribution of the data on which the model was trained is publicly available. The attack also assumes that the attacker has obtained the values of non-sensitive attributes for a set of target individuals. The objective of the attack is to infer the values of sensitive attributes for these target individuals. We explore how replacing the original data with synthetic data when training the model affects the attacker's success in inferring sensitive attributes.
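To make the threat model concrete, the following is a minimal sketch of one generic way such an attribute inference attack can be mounted. It is not necessarily the procedure studied here. It assumes, beyond what is stated above, that the attacker also knows each target's true outcome label and that the model returns a probability for the "will move" class. The names `infer_sensitive_value`, `query_model`, `toy_model`, and the attribute names are hypothetical and chosen for illustration only.

```python
from typing import Callable, Dict, Hashable, Mapping


def infer_sensitive_value(
    known: Mapping[str, Hashable],
    sensitive_name: str,
    marginal: Mapping[Hashable, float],
    query_model: Callable[[Dict[str, Hashable]], float],
    observed_label: int,
) -> Hashable:
    """Return the candidate sensitive value that best explains the observed label.

    Assumptions (not taken from the source): the attacker knows the target's
    true label, and query_model returns P(label == 1) for a full record.
    """
    best_value, best_score = None, -1.0
    for value, prior in marginal.items():
        # Complete the target's record with one candidate sensitive value.
        record = dict(known)
        record[sensitive_name] = value
        p_move = query_model(record)
        # How well does the model's output agree with the known label?
        likelihood = p_move if observed_label == 1 else 1.0 - p_move
        # Weight the agreement by the public marginal probability of the value.
        score = prior * likelihood
        if score > best_score:
            best_value, best_score = value, score
    return best_value


# Toy stand-in for the queried classifier (hypothetical, for illustration only).
def toy_model(record: Dict[str, Hashable]) -> float:
    return 0.9 if record["income_band"] == "low" else 0.2


toy_marginal = {"low": 0.3, "high": 0.7}
target = {"age_band": "30-39"}  # non-sensitive attributes known to the attacker

guess_mover = infer_sensitive_value(target, "income_band", toy_marginal, toy_model, 1)
guess_stayer = infer_sensitive_value(target, "income_band", toy_marginal, toy_model, 0)
```

In this sketch the public marginal distribution acts as a prior over candidate values, and the model's response supplies the evidence. Training the model on synthetic data would change only what `query_model` returns, which is the quantity the comparison above is concerned with.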