Support vector machines (SVMs) are widely used machine learning models (e.g., in remote sensing), with formulations for both classification and regression tasks. In recent years, with the advent of working quantum annealers, hybrid SVM models characterised by quantum training and classical execution have been introduced. These models have demonstrated performance comparable to that of their classical counterparts. However, their training set size is limited by the restricted connectivity of current quantum annealers. Hence, a strategy is required to take advantage of large datasets (like those related to Earth observation). In the classical domain, local SVMs, namely, SVMs trained on the data samples selected by a k-nearest neighbors model, have already proven successful. Here, the local application of quantum-trained SVM models is proposed and empirically assessed. In particular, this approach overcomes the constraints on the training set size of the quantum-trained models while enhancing their performance. In practice, the FaLK-SVM method, designed for efficient local SVMs, has been combined with quantum-trained SVM models for binary and multiclass classification. In addition, for comparison, FaLK-SVM has been interfaced for the first time with a classical single-step multiclass SVM model (CS SVM). For the empirical evaluation, D-Wave's quantum annealers and real-world datasets taken from the remote sensing domain have been employed. The results show the effectiveness and scalability of the proposed approach, as well as its practical applicability in a real-world large-scale scenario.
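To make the notion of a local SVM concrete, the following is a minimal classical sketch, assuming scikit-learn and NumPy are available. For each query point it selects the k nearest training samples and fits an SVM on that neighbourhood only, so the size of every training problem is bounded by k regardless of the dataset size. The function name `local_svm_predict`, the parameter `k`, and the RBF kernel settings are illustrative assumptions. The sketch uses a classically trained `SVC` in place of a quantum-trained model, and it does not reproduce the mechanisms that make FaLK-SVM efficient.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors
from sklearn.svm import SVC


def local_svm_predict(X_train, y_train, X_query, k=20):
    """Label each query point with an SVM trained on its k nearest training samples.

    Hypothetical helper for illustration; not the FaLK-SVM algorithm.
    """
    X_train = np.asarray(X_train)
    y_train = np.asarray(y_train)
    X_query = np.asarray(X_query)

    # Find the indices of the k nearest training samples for every query point.
    nn = NearestNeighbors(n_neighbors=k).fit(X_train)
    _, neighbor_idx = nn.kneighbors(X_query)

    predictions = np.empty(len(X_query), dtype=y_train.dtype)
    for i, idx in enumerate(neighbor_idx):
        local_labels = y_train[idx]
        classes = np.unique(local_labels)
        if len(classes) == 1:
            # All neighbours share one label, so no SVM needs to be trained.
            predictions[i] = classes[0]
        else:
            # Train a small SVM on the neighbourhood only (at most k samples).
            clf = SVC(kernel="rbf", gamma="scale").fit(X_train[idx], local_labels)
            predictions[i] = clf.predict(X_query[i:i + 1])[0]
    return predictions
```

In the setting described above, the call to `SVC(...).fit` is the step that a quantum-trained SVM would replace, with k chosen so that each local training problem fits on the annealer.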