Machine learning models are often personalized based on information that is protected, sensitive, self-reported, or costly to acquire. These models use information about people, but neither facilitate nor inform their \emph{consent}. Individuals cannot opt out of reporting information that a model needs to personalize their predictions, nor can they tell whether they would benefit from personalization in the first place. In this work, we introduce a new family of prediction models, called \emph{participatory systems}, that allow individuals to opt into personalization at prediction time. We present a model-agnostic algorithm to learn participatory systems for supervised learning tasks where models are personalized with categorical group attributes. We conduct a comprehensive empirical study of participatory systems in clinical prediction tasks, comparing them to common approaches for personalization and imputation. Our results demonstrate that participatory systems can facilitate and inform consent in a way that improves performance and privacy across all groups who report personal data.