One-class classification (OCC) is a classification problem in which the training data contain only one class. The one-class support vector machine (OCSVM) is one of the most competitive OCC algorithms, but it scales poorly to large datasets. This paper proposes an acceleration strategy for OCSVM. The idea is to decompose the dataset into individual samples and train one OCSVM model per data point. Ensemble learning then combines these per-sample models into the OCSVM model for the whole dataset. Further acceleration is achieved through a data-reduction strategy that uses an OCSVM model trained on the average of the training samples. Experiments compare the proposed strategy with traditional OCSVM using a Python package. The proposed strategy is faster than traditional OCSVM while achieving similar classification results. Moreover, it creates a one-to-one correspondence between samples and models. The source code is available at https://github.com/ToshiHayashi/ODSVM
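The decomposition idea can be illustrated with a minimal sketch. This is not the authors' implementation (see the repository for that); it rests on several assumptions made here for illustration: an RBF kernel, the observation that an OCSVM fitted to a single point has dual weight 1 so its score reduces to the kernel value at that point, an ensemble that simply averages the per-sample scores, and a threshold taken at the `nu`-quantile of the training scores. The names `rbf_kernel`, `make_point_model`, and `fit_ensemble` are hypothetical, and the mean-based data-reduction step is not covered. Plain Python is used to keep the example dependency-free.

```python
import math


def rbf_kernel(a, b, gamma):
    """RBF kernel exp(-gamma * ||a - b||^2) between two points."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-gamma * sq_dist)


def make_point_model(center, gamma):
    """Score function for one training sample.

    With a single training point the dual constraint sum(alpha) = 1
    forces alpha = 1, so the kernel expansion is just k(center, x).
    """
    return lambda x: rbf_kernel(center, x, gamma)


def fit_ensemble(samples, gamma=0.5, nu=0.1):
    """Build one model per sample and average their scores.

    Assumption (not from the paper): the decision threshold is the
    nu-quantile of the ensemble scores on the training samples.
    """
    models = [make_point_model(s, gamma) for s in samples]

    def score(x):
        return sum(m(x) for m in models) / len(models)

    train_scores = sorted(score(s) for s in samples)
    threshold = train_scores[min(int(nu * len(samples)), len(samples) - 1)]

    def predict(x):
        # +1 for inliers, -1 for outliers, following the usual OCSVM convention
        return 1 if score(x) >= threshold else -1

    return models, score, predict


# Toy data: a small cluster around the origin
samples = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (-0.1, 0.0), (0.0, -0.1),
    (0.1, 0.1), (-0.1, -0.1), (0.1, -0.1), (-0.1, 0.1), (0.05, 0.05),
]
models, score, predict = fit_ensemble(samples, gamma=0.5, nu=0.1)

inlier_label = predict((0.0, 0.0))   # point inside the cluster
outlier_label = predict((5.0, 5.0))  # point far from the cluster
```

Because `models` holds exactly one entry per training sample, the sketch also shows the one-to-one correspondence between samples and models mentioned in the abstract. Under these assumptions, adding or removing a sample only means adding or removing its model and recomputing the threshold.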