With careful manipulation, malicious agents can reverse engineer private information encoded in pre-trained language models. Security concerns motivate the development of quantum pre-training. In this work, we propose a highly Portable Quantum Language Model (PQLM) that can easily transmit information to downstream tasks on classical machines. The framework consists of a cloud PQLM built with random Variational Quantum Classifiers (VQC) and local models for downstream applications. We demonstrate the ad hoc portability of the quantum model by extracting only the word embeddings and effectively applying them to downstream tasks on classical machines. Our PQLM exhibits comparable performance to its classical counterpart on both intrinsic evaluation (loss, perplexity) and extrinsic evaluation (multilingual sentiment analysis accuracy) metrics. We also perform ablation studies on the factors affecting PQLM performance to analyze model stability. Our work establishes a theoretical foundation for a portable quantum pre-trained language model that could be trained on private data and made available for public use with privacy protection guarantees.