Models based on vision transformer architectures are considered state-of-the-art for image classification tasks. However, they require extensive computational resources for both training and deployment, and the problem is exacerbated as the amount and complexity of the data increase. Quantum-based vision transformer models could potentially alleviate this issue by reducing training and operating time while maintaining the same predictive power. Although current quantum computers are not yet able to perform high-dimensional tasks, they remain one of the most promising routes to an efficient solution in the future. In this work, we construct several variations of a quantum hybrid vision transformer for a classification problem in high energy physics (distinguishing photons from electrons in the electromagnetic calorimeter) and test them against classical vision transformer architectures. Our findings indicate that the hybrid models can achieve performance comparable to that of their classical analogues with a similar number of parameters.