In this article, we investigate the effect of conformal transformations on the kernel functions used in Support Vector Machines. We focus on text document categorization, the task of assigning each document to a category. We introduce a new Gaussian Cosine kernel together with two conformal transformations. Previous studies showed that conformal transformations increase class separability on synthetic and low-dimensional datasets; we extend that analysis to the high-dimensional domain of text data. Our experiments, conducted on the Reuters dataset across two types of binary classification tasks, compare the Linear, Gaussian, and Gaussian Cosine kernels against their conformally transformed counterparts. The results indicate that conformal transformations can significantly improve kernel performance, particularly for sub-optimal kernels: improvements were observed in 60% of the tested scenarios for the Linear kernel, 84% for the Gaussian kernel, and 80% for the Gaussian Cosine kernel.
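To make the idea of a conformally transformed kernel concrete, the following is a minimal sketch of the general construction, in which a base kernel K(x, y) is rescaled to c(x) c(y) K(x, y) by a positive factor function c. The abstract does not specify the two transformations studied or the form of the Gaussian Cosine kernel, so neither is reproduced here. The factor `conformal_factor` (a sum of Gaussian bumps around chosen centers), the parameters `gamma` and `tau`, and the random data are all assumptions made purely for illustration.

```python
import numpy as np


def squared_distances(X, Y):
    """Pairwise squared Euclidean distances between rows of X and rows of Y."""
    sq = (X ** 2).sum(axis=1)[:, None] + (Y ** 2).sum(axis=1)[None, :] - 2.0 * X @ Y.T
    return np.maximum(sq, 0.0)  # clip tiny negatives caused by rounding


def linear_kernel(X, Y):
    """Linear kernel: K(x, y) = <x, y>."""
    return X @ Y.T


def gaussian_kernel(X, Y, gamma=1.0):
    """Gaussian kernel: K(x, y) = exp(-gamma * ||x - y||^2)."""
    return np.exp(-gamma * squared_distances(X, Y))


def conformal_factor(X, centers, tau=1.0):
    """Hypothetical positive factor c(x): a sum of Gaussian bumps around `centers`.

    This choice is an assumption for illustration, not one of the paper's
    two transformations.
    """
    return np.exp(-squared_distances(X, centers) / (2.0 * tau ** 2)).sum(axis=1)


def conformal_transform(K, c_X, c_Y):
    """Rescale a kernel matrix: K_tilde[i, j] = c_X[i] * K[i, j] * c_Y[j]."""
    return c_X[:, None] * K * c_Y[None, :]


# Toy data standing in for document feature vectors (assumed, not Reuters).
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 4))
centers = X[:2]  # assumed centers, e.g. points near the decision boundary

c = conformal_factor(X, centers, tau=1.5)
K_linear = linear_kernel(X, X)
K_gauss = gaussian_kernel(X, X, gamma=0.5)
K_linear_t = conformal_transform(K_linear, c, c)
K_gauss_t = conformal_transform(K_gauss, c, c)
```

Because c is strictly positive, the transformed Gram matrix equals D K D with D a positive diagonal matrix, so it stays symmetric and positive semi-definite and remains a valid kernel. A matrix built this way could be passed to any SVM implementation that accepts precomputed Gram matrices, for example scikit-learn's `SVC(kernel="precomputed")`.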