Modeling long-range dependencies in sequential data remains a central challenge in machine learning. Transformers address this challenge through attention mechanisms, but their quadratic complexity with respect to sequence length limits scalability to long contexts. State-space models (SSMs) provide an efficient alternative with linear-time computation by evolving a latent state through recurrent updates, but their memory is typically formed via additive or linear transitions, which can limit their ability to capture complex global interactions across tokens. In this work, we present one of the first studies to leverage the superposition property of quantum systems to enhance state-based sequence modeling. In particular, we propose Quantum Long-Attention Memory (QLAM), a hybrid quantum-classical memory mechanism that can be viewed as a quantum extension of state-space models. Instead of maintaining a classical latent state updated through additive dynamics, QLAM represents the hidden state as a quantum state whose amplitudes encode a superposition of historical information. The state evolves through parameterized quantum circuits conditioned on the input, enabling a non-classical, global update mechanism. In this way, QLAM preserves the recurrent, linear-time structure of SSMs while fundamentally enriching the memory representation through quantum superposition. Unlike attention mechanisms, which explicitly compute pairwise interactions, QLAM implicitly captures global dependencies through the evolution of the quantum state and retrieves task-relevant information via query-dependent measurements. We evaluate QLAM on sequential variants of standard image classification benchmarks, including sMNIST, sFashion-MNIST, and sCIFAR-10, where images are flattened into token sequences. Across all tasks, QLAM consistently improves over recurrent baselines and transformer-based models.
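The recurrence described above (a quantum state updated once per token by an input-conditioned circuit, then read out through a query-dependent measurement) can be illustrated with a small classical statevector simulation. This is only a toy sketch under stated assumptions, not the QLAM circuit itself: the gate layout (one RY rotation per qubit followed by a ring of CNOTs), the parameter names `theta`, `w`, and `query_angles`, and all function names are hypothetical choices made for illustration, and it assumes at least two qubits and scalar tokens.

```python
import math

def apply_ry(state, qubit, angle):
    """Apply an RY(angle) rotation to one qubit of a real-valued statevector."""
    c, s = math.cos(angle / 2), math.sin(angle / 2)
    out = list(state)
    bit = 1 << qubit
    for i in range(len(state)):
        if not i & bit:
            a0, a1 = state[i], state[i | bit]
            out[i] = c * a0 - s * a1
            out[i | bit] = s * a0 + c * a1
    return out

def apply_cnot(state, control, target):
    """Apply CNOT: swap target-bit amplitudes wherever the control bit is 1."""
    out = list(state)
    cbit, tbit = 1 << control, 1 << target
    for i in range(len(state)):
        if i & cbit:
            out[i] = state[i ^ tbit]
    return out

def step(state, x, theta, w, n_qubits):
    """One recurrent update: input-conditioned rotations, then a CNOT ring."""
    for q in range(n_qubits):
        # Assumed encoding: rotation angle is an affine function of the token.
        state = apply_ry(state, q, theta[q] + w[q] * x)
    for q in range(n_qubits):
        state = apply_cnot(state, q, (q + 1) % n_qubits)
    return state

def readout(state, query_angles, n_qubits):
    """Query-dependent measurement: rotate by the query, return <Z> per qubit."""
    for q in range(n_qubits):
        state = apply_ry(state, q, query_angles[q])
    expectations = []
    for q in range(n_qubits):
        bit = 1 << q
        expectations.append(
            sum((-1 if i & bit else 1) * a * a for i, a in enumerate(state))
        )
    return expectations

def run(tokens, theta, w, query_angles):
    """Process a token sequence with one fixed-size state update per token."""
    n_qubits = len(theta)
    state = [0.0] * (2 ** n_qubits)
    state[0] = 1.0  # start in the all-zeros basis state
    for x in tokens:
        state = step(state, x, theta, w, n_qubits)
    return state, readout(state, query_angles, n_qubits)

if __name__ == "__main__":
    tokens = [0.1, 0.5, 0.9]  # e.g. normalized pixel intensities
    state, features = run(tokens, [0.3, -0.2, 0.7], [1.0, 0.5, -1.5], [0.0, 0.4, 0.8])
    print(features)
```

The loop in `run` performs one fixed-size update per token, which mirrors the linear-time recurrent structure of SSMs. Note that simulating the state classically costs memory exponential in the number of qubits, so this sketch is only practical for a handful of qubits.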