Sentiment analysis is the process of identifying and categorizing people's emotions or opinions on various topics. The analysis of Twitter sentiment has become an increasingly popular research topic in recent years. In this paper, we present several machine learning models and one deep learning model for analyzing the sentiment of Persian political tweets. We used Bag of Words and ParsBERT for word representation. We applied Gaussian Naive Bayes, Gradient Boosting, Logistic Regression, Decision Trees, and Random Forests, as well as a combination of CNN and LSTM, to classify the polarity of tweets. The results indicate that deep learning with ParsBERT embeddings performs better than the classical machine learning models. The CNN-LSTM model achieved the highest classification accuracy, with 89 percent on the first dataset and 71 percent on the second dataset. Achieving this level of efficiency was difficult because of the complexity of Persian. The main objective of our research was to reduce the training time while maintaining the model's performance. To that end, we made several adjustments to the model architecture and parameters. In addition to achieving this objective, the adjustments slightly improved performance.
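As an illustration of the Bag of Words plus Logistic Regression baseline mentioned above, the following is a minimal sketch in pure Python. It is not the implementation used in the paper: the whitespace tokenizer, the gradient descent settings (`lr`, `epochs`), all function names, and the toy English corpus are assumptions chosen only to keep the example self-contained. A real pipeline would need Persian-specific normalization and tokenization.

```python
import math
from collections import Counter


def build_vocabulary(texts):
    """Map each distinct whitespace token to a column index."""
    vocab = {}
    for text in texts:
        for token in text.split():
            vocab.setdefault(token, len(vocab))
    return vocab


def bag_of_words(text, vocab):
    """Return a count vector; tokens outside the vocabulary are ignored."""
    counts = Counter(text.split())
    vector = [0.0] * len(vocab)
    for token, n in counts.items():
        if token in vocab:
            vector[vocab[token]] = float(n)
    return vector


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic_regression(X, y, lr=0.5, epochs=200):
    """Fit weights and bias with plain batch gradient descent on the log loss."""
    weights = [0.0] * len(X[0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for features, label in zip(X, y):
            score = sum(w * f for w, f in zip(weights, features)) + bias
            error = sigmoid(score) - label
            for j, f in enumerate(features):
                grad_w[j] += error * f
            grad_b += error
        weights = [w - lr * g / len(X) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(X)
    return weights, bias


def predict(text, vocab, weights, bias):
    """Return 1 (positive) or 0 (negative) for a single text."""
    features = bag_of_words(text, vocab)
    score = sum(w * f for w, f in zip(weights, features)) + bias
    return 1 if sigmoid(score) >= 0.5 else 0


# Hypothetical toy corpus with placeholder tokens, not data from the paper.
train_texts = ["good policy", "great decision", "bad policy", "terrible decision"]
train_labels = [1, 1, 0, 0]

vocab = build_vocabulary(train_texts)
X = [bag_of_words(t, vocab) for t in train_texts]
weights, bias = train_logistic_regression(X, train_labels)
```

Calling `predict("good decision", vocab, weights, bias)` then classifies an unseen combination of known tokens. Because Bag of Words discards word order and context, it is a natural baseline against which contextual embeddings such as ParsBERT can be compared.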