As social media grows rapidly, harassment becomes more prevalent, which has made fake account detection an attractive field for researchers. The graph structure of the data, with its large number of nodes, raises several obstacles, including a considerable number of irrelevant features in the matrices as well as high dispersion and imbalanced classes in the dataset. To deal with these issues, autoencoders were used together with SGAN, a combination of semi-supervised learning and the generative adversarial network (GAN) algorithm. This paper uses only a small number of labels and applies SGAN as a classifier. The results show that accuracy reached 91\% in detecting fake accounts using only 100 labeled samples.