Recently, convolutional neural networks (CNNs) have attracted a great deal of attention due to their remarkable performance in various domains, particularly in image and text classification tasks. However, their application to tabular data classification remains underexplored. Nonimage data are prevalent in many fields, such as bioinformatics, finance, and medicine, and adapting CNNs to classify such data remains highly challenging. This paper investigates the efficacy of CNNs for tabular data classification, aiming to bridge the gap between traditional machine learning approaches and deep learning techniques. We propose a novel framework, the fuzzy convolution neural network (FCNN), tailored specifically to tabular data and designed to capture local patterns within feature vectors. In our approach, feature values are mapped to fuzzy memberships, and the resulting fuzzy membership vectors are converted into images that are used to train the CNN model. The trained CNN model is then used to classify unknown feature vectors. To validate our approach, we generated six complex, noisy data sets. For each data set, a randomly selected 70% of the samples was used for training and the remaining 30% for testing. The data sets were also classified using state-of-the-art machine learning algorithms such as decision tree (DT), support vector machine (SVM), fuzzy neural network (FNN), Bayes classifier, and random forest (RF). Experimental results demonstrate that the proposed model can effectively learn meaningful representations from tabular data, achieving competitive or superior performance compared to these existing methods. Overall, our findings suggest that the proposed FCNN model holds promise as a viable alternative for tabular data classification tasks, offering a fresh perspective and potentially unlocking new opportunities for leveraging deep learning in structured data analysis.
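The preprocessing described in the abstract (feature values mapped to fuzzy memberships, membership vectors arranged as images, and a random 70/30 split) can be illustrated with a minimal sketch. The abstract does not specify the membership functions, the number of fuzzy sets, or the image layout, so everything below is an assumption made for illustration: Gaussian membership functions, three evenly spaced fuzzy sets per feature, an image with one row per feature and one column per fuzzy set, and the hypothetical helper names `fuzzify` and `split_70_30`. It is not the authors' implementation, and the CNN itself is omitted.

```python
import numpy as np


def fuzzify(X, n_sets=3):
    """Map each feature value to n_sets Gaussian membership degrees.

    Assumed layout (not from the paper): the result has shape
    (n_samples, n_features, n_sets), i.e. one small single-channel
    "image" per sample. Requires n_sets >= 2.
    """
    X = np.asarray(X, dtype=float)
    lo = X.min(axis=0)
    hi = X.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)   # guard against constant features
    Xn = (X - lo) / span                     # scale each feature to [0, 1]

    centers = np.linspace(0.0, 1.0, n_sets)  # assumed: evenly spaced set centers
    sigma = 1.0 / (2.0 * (n_sets - 1))       # assumed: width tied to center spacing
    diff = Xn[:, :, None] - centers[None, None, :]
    return np.exp(-0.5 * (diff / sigma) ** 2)


def split_70_30(n_samples, seed=0):
    """Return disjoint index arrays for a random 70% / 30% split."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    cut = int(round(0.7 * n_samples))
    return order[:cut], order[cut:]


# Synthetic stand-in for a tabular data set: 100 samples, 8 features.
rng = np.random.default_rng(42)
X = rng.normal(size=(100, 8))

images = fuzzify(X)
train_idx, test_idx = split_70_30(len(X))
print(images.shape, len(train_idx), len(test_idx))  # (100, 8, 3) 70 30
```

In this sketch, `images[train_idx]` would be the input to a CNN and `images[test_idx]` would be held out for evaluation. For simplicity the feature ranges are computed over the whole data set; in a real experiment they would normally be estimated on the training portion only, so that no information from the test samples leaks into preprocessing.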