We propose Quantum-informed Tensor Adaptation (QuanTA), a novel, easy-to-implement fine-tuning method with no inference overhead for large-scale pre-trained language models. By leveraging quantum-inspired methods derived from quantum circuit structures, QuanTA enables efficient high-rank fine-tuning, overcoming the limitations of Low-Rank Adaptation (LoRA), whose low-rank approximation may fail for complicated downstream tasks. Our approach is theoretically supported by the universality theorem and the rank representation theorem, which underpin its efficient high-rank adaptations. Experiments demonstrate that QuanTA significantly enhances commonsense reasoning, arithmetic reasoning, and scalability compared to traditional methods. Furthermore, QuanTA achieves superior performance with fewer trainable parameters than other approaches and can be integrated with existing fine-tuning algorithms for further improvement, providing a scalable and efficient solution for fine-tuning large language models and advancing the state of the art in natural language processing.
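To make the contrast with LoRA concrete, the following is a minimal, self-contained sketch (not the authors' implementation) of the underlying idea: build a weight update from small "gate" tensors acting on a factorized index, analogous to two-qubit gates in a quantum circuit, so the composite update can be high rank even though each trainable tensor is tiny. The axis sizes, the gate placement pattern, and the helper `apply_gate` are illustrative assumptions introduced here, not taken from the paper.

```python
import torch

torch.manual_seed(0)
d, r = 64, 4                                   # hidden size and LoRA rank (toy values)

# LoRA-style update: rank(delta_W) is capped at r.
A, B = torch.randn(r, d) * 0.02, torch.randn(d, r) * 0.02
print(torch.linalg.matrix_rank(B @ A).item())  # at most 4

# Quantum-circuit-inspired update: factor d = 4*4*4 into three "qubit-like" axes
# and apply small two-axis gates (16x16 matrices), analogous to two-qubit gates.
axes = (4, 4, 4)

def apply_gate(w, gate, i, j):
    """Contract a two-axis gate into w along factor axes i and j.
    w has shape (*axes, d): the row index is factorized, the column index is last."""
    n = w.dim() - 1
    rest = [k for k in range(n) if k not in (i, j)] + [n]
    perm = [i, j] + rest
    shape = w.shape
    out = w.permute(perm).reshape(shape[i] * shape[j], -1)
    out = gate @ out                                           # apply the gate
    out = out.reshape([shape[i], shape[j]] + [shape[k] for k in rest])
    inv = [perm.index(k) for k in range(len(perm))]            # undo the permutation
    return out.permute(inv)

# Three near-identity gates, each touching a different pair of factor axes.
gates = [torch.eye(16) + 0.02 * torch.randn(16, 16) for _ in range(3)]
pairs = [(0, 1), (1, 2), (0, 2)]

w = torch.eye(d).reshape(*axes, d)             # start from the identity map
for g, (i, j) in zip(gates, pairs):
    w = apply_gate(w, g, i, j)
delta_w = w.reshape(d, d) - torch.eye(d)       # the adaptation applied to the weight

# Generic random gates typically give a full-rank update (64 here), even though
# each gate is tiny; with many small factor axes the per-gate parameter count
# stays fixed, whereas LoRA's 2*d*r parameters grow with the model dimension.
print(torch.linalg.matrix_rank(delta_w).item())
```

In this toy setting the gate tensors would be the trainable parameters while the pre-trained weight stays frozen, and the gates can be merged into the weight after training, which is consistent with the abstract's claim of no inference overhead; the specific parameterization used by QuanTA may differ.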