Tensor networks, widely used to provide efficient representations of low-energy states of local quantum many-body systems, have recently been proposed as machine learning architectures that could offer advantages over traditional ones. In this work we show that tensor-network architectures have especially promising properties for privacy-preserving machine learning, which is important in tasks such as the processing of medical records. First, we describe a new privacy vulnerability that is present in feedforward neural networks, illustrating it on synthetic and real-world datasets. Then, we develop well-defined conditions that guarantee robustness to this vulnerability, which involve the characterization of models equivalent under gauge symmetry. We rigorously prove that these conditions are satisfied by tensor-network architectures. In doing so, we define a novel canonical form for matrix product states, which has a high degree of regularity and fixes the residual gauge that is left in the canonical forms based on singular value decompositions. We supplement the analytical findings with practical examples where matrix product states are trained on datasets of medical records, showing large reductions in the probability that an attacker extracts information about the training dataset from the model's parameters. Given the growing expertise in training tensor-network architectures, these results imply that one may not be forced to choose between prediction accuracy and the privacy of the information processed.
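The gauge symmetry mentioned above can be illustrated with a minimal sketch: inserting an invertible matrix and its inverse on a bond of a matrix product state changes the stored parameters but not the function the model represents. The three-site tensors, the matrix `X`, and all helper names below are arbitrary values chosen for illustration; they are assumptions, not taken from this work, and the sketch does not implement the canonical form it defines.

```python
from itertools import product

def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def inverse_2x2(m):
    """Invert a 2x2 matrix (assumes a non-zero determinant)."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def amplitudes(mps):
    """Contract the MPS: one amplitude per configuration of physical indices."""
    result = {}
    for config in product(range(2), repeat=len(mps)):
        mat = mps[0][config[0]]
        for site, s in zip(mps[1:], config[1:]):
            mat = matmul(mat, site[s])
        result[config] = mat[0][0]
    return result

# Hypothetical 3-site MPS: physical dimension 2, bond dimension 2.
# mps[site][physical_index] is a matrix over the bond indices.
mps = [
    [[[1.0, 0.5]], [[0.0, 2.0]]],                            # two 1x2 matrices
    [[[1.0, 0.0], [0.5, 1.0]], [[0.0, 1.0], [1.0, -1.0]]],   # two 2x2 matrices
    [[[1.0], [0.0]], [[0.5], [1.5]]],                        # two 2x1 matrices
]

# Arbitrary invertible gauge matrix acting on the bond between sites 1 and 2.
X = [[2.0, 1.0], [1.0, 1.0]]
X_inv = inverse_2x2(X)

# Gauge transformation: A1 -> A1 X and A2 -> X^{-1} A2; site 3 is untouched.
gauged = [
    [matmul(m, X) for m in mps[0]],
    [matmul(X_inv, m) for m in mps[1]],
    mps[2],
]

original = amplitudes(mps)
transformed = amplitudes(gauged)
max_diff = max(abs(original[c] - transformed[c]) for c in original)
params_changed = gauged[0] != mps[0]

print("largest amplitude difference:", max_diff)
print("parameters changed:", params_changed)
```

Both parameter sets define the same model, so any statement about what the parameters reveal has to account for this whole equivalence class; a canonical form amounts to picking one representative from it.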