Individuals and organizations struggle to protect personal information without a fundamental understanding of relative privacy risks. By analyzing over 5,000 empirical identity theft and fraud cases, this research identifies which types of personal data are exposed, how frequently such exposures occur, and what consequences they have. We construct an Identity Ecosystem graph, a foundational graph-based model in which nodes represent personally identifiable information (PII) attributes and edges represent empirical disclosure relationships between them (e.g., one PII attribute is exposed because another was exposed). Leveraging this graph structure, we develop a privacy risk prediction framework that uses graph theory and graph neural networks to estimate the likelihood of further disclosures when certain PII attributes are compromised. The results show that our approach effectively addresses the core question: can the disclosure of a given identity attribute lead to the disclosure of another attribute? The code for the privacy risk prediction framework is available at: https://github.com/niu-haoran/Privacy-Risk-Predictions-and-UTCID-Identity-Ecosystem.git.
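As a rough illustration of the core question, the graph model can be sketched as a directed graph in which a path from one PII attribute to another means that a chain of disclosures is possible. The sketch below uses plain breadth-first reachability, which is only the graph-theoretic part of the idea; it is not the authors' framework and does not involve graph neural networks or likelihood estimates. The attribute names and edges are hypothetical, chosen for illustration and not taken from the case data.

```python
from collections import deque

# Hypothetical toy edges, for illustration only (not from the case data).
# (source, target) means: exposure of `source` can lead to exposure of `target`.
EDGES = [
    ("email_address", "password"),
    ("password", "bank_account_number"),
    ("name", "date_of_birth"),
    ("date_of_birth", "social_security_number"),
]


def build_graph(edges):
    """Build an adjacency map {attribute: set of directly exposed attributes}."""
    graph = {}
    for source, target in edges:
        graph.setdefault(source, set()).add(target)
        graph.setdefault(target, set())  # make sure sink nodes are present too
    return graph


def can_lead_to(graph, start, goal):
    """Return True if a directed disclosure path runs from `start` to `goal`."""
    if start not in graph or goal not in graph:
        return False
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return True
        for neighbor in graph[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return False


graph = build_graph(EDGES)
print(can_lead_to(graph, "email_address", "bank_account_number"))  # True
print(can_lead_to(graph, "name", "password"))  # False
```

Reachability answers only whether a disclosure chain exists in the graph. Estimating how likely such a chain is requires the kind of learned model that the abstract describes.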