Privacy-preserving machine learning with tensor networks

Alejandro Pozas-Kerstjens, Senaida Hernández-Santana, José Ramón Pareja Monturiol, Marco Castrillón López, Giannicola Scarpa, Carlos E. González-Guillén, David Pérez-García

From arXiv. 16 pages, 2 figures. Quantumarticle 6.1. The computational appendix is available at https://www.github.com/apozas/private-tn. V3: Published version.

Tensor networks, widely used for providing efficient representations of low-energy states of local quantum many-body systems, have recently been proposed as machine learning architectures that could offer advantages over traditional ones. In this work we show that tensor network architectures have especially promising properties for privacy-preserving machine learning, which is important in tasks such as the processing of medical records. First, we describe a new privacy vulnerability that is present in feedforward neural networks, illustrating it on synthetic and real-world datasets. Then, we develop well-defined conditions to guarantee robustness to this vulnerability, which involve the characterization of models equivalent under gauge symmetry. We rigorously prove that such conditions are satisfied by tensor-network architectures. In doing so, we define a novel canonical form for matrix product states, which has a high degree of regularity and fixes the residual gauge that is left in the canonical forms based on singular value decompositions. We supplement the analytical findings with practical examples where matrix product states are trained on datasets of medical records, which show large reductions in the probability of an attacker extracting information about the training dataset from the model's parameters. Given the growing expertise in training tensor-network architectures, these results imply that one may not be forced to choose between accuracy in prediction and ensuring the privacy of the information processed.
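
The gauge symmetry mentioned in the abstract can be illustrated with a small numerical sketch. It is not taken from the paper or its computational appendix: the helper names `contract_mps` and `apply_gauge`, the tensor shapes, and the choice of a random near-identity gauge matrix are all assumptions made for illustration. The sketch only shows the standard fact that inserting an invertible matrix and its inverse on a bond of a matrix product state changes the stored parameters while leaving the represented tensor unchanged.

```python
import numpy as np


def contract_mps(tensors):
    """Contract an open-boundary MPS into the full tensor it represents.

    Each site tensor has shape (left_bond, physical, right_bond);
    the outermost bonds have dimension 1. (Hypothetical helper.)
    """
    result = tensors[0]
    for t in tensors[1:]:
        # Contract the right bond of the partial result with the left bond of t.
        result = np.tensordot(result, t, axes=1)
    # Drop the two trivial boundary bonds.
    return result.squeeze(axis=(0, -1))


def apply_gauge(tensors, site, X):
    """Insert X @ inv(X) on the bond between `site` and `site + 1`.

    (Hypothetical helper; X must be invertible.)
    """
    X_inv = np.linalg.inv(X)
    new = [t.copy() for t in tensors]
    new[site] = np.einsum("lpr,rs->lps", tensors[site], X)
    new[site + 1] = np.einsum("sr,rpq->spq", X_inv, tensors[site + 1])
    return new


rng = np.random.default_rng(0)
d, D, n = 2, 3, 4  # physical dimension, bond dimension, number of sites (assumed)
shapes = [(1, d, D)] + [(D, d, D)] * (n - 2) + [(D, d, 1)]
mps = [rng.normal(size=s) for s in shapes]

# A near-identity matrix, chosen so that it is comfortably invertible.
X = np.eye(D) + 0.1 * rng.normal(size=(D, D))
gauged = apply_gauge(mps, site=1, X=X)

# Same represented tensor, different stored parameters.
same_function = np.allclose(contract_mps(mps), contract_mps(gauged))
params_differ = not np.allclose(mps[1], gauged[1])
```

In this sketch `same_function` checks that both parameter sets describe the same model, and `params_differ` checks that the site tensors themselves have changed. Many parameter sets therefore correspond to one model, which is the freedom that a canonical form is meant to fix.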

