Models based on vision transformer architectures are considered state-of-the-art for image classification tasks. However, they require extensive computational resources for both training and deployment, and the problem is exacerbated as the amount and complexity of the data increase. Quantum-based vision transformer models could potentially alleviate this issue by reducing training and operating time while maintaining the same predictive power. Although current quantum computers are not yet able to perform high-dimensional tasks, they may offer one of the most efficient solutions in the future. In this work, we construct several variations of a quantum hybrid vision transformer for a classification problem in high energy physics (distinguishing photons from electrons in the electromagnetic calorimeter) and test them against classical vision transformer architectures. Our findings indicate that the hybrid models can achieve performance comparable to that of their classical analogues with a similar number of parameters.