In graph machine learning, data collection, sharing, and analysis often involve multiple parties, each of which may require a different level of data security and privacy. Preserving privacy is therefore essential to protecting sensitive information. In the era of big data, the relationships among data entities have become unprecedentedly complex, and more applications rely on advanced data structures (i.e., graphs) that can represent network structure together with the associated attribute information. To date, many graph-based AI models (e.g., graph neural networks) have been proposed for tasks in various domains, such as computer vision and natural language processing. In this paper, we review privacy-preserving techniques for graph machine learning. We systematically review related work from the data aspect to the computational aspect. We first review methods for generating privacy-preserving graph data. We then describe methods for transmitting privacy-preserved information (e.g., graph model parameters) to enable optimization-based computation when sharing data among multiple parties is risky or impossible. In addition to discussing the relevant theoretical methodology and software tools, we discuss current challenges and highlight several possible future research opportunities for privacy-preserving graph machine learning. Finally, we envision a unified and comprehensive secure graph machine learning system.