Mining and analyzing the big data of Twitter conversations has been of significant interest to the scientific community in healthcare, epidemiology, big data, data science, computer science, and related areas. Several works in the last few years have focused on sentiment analysis and other forms of text analysis of tweets related to Ebola, E. coli, dengue, human papillomavirus, Middle East respiratory syndrome, measles, Zika virus, H1N1, influenza-like illness, swine flu, flu, cholera, listeriosis, cancer, liver disease, inflammatory bowel disease, kidney disease, lupus, Parkinson's disease, diphtheria, and West Nile virus. The recent outbreaks of COVID-19 and MPox have served as catalysts for Twitter usage related to seeking and sharing information, views, opinions, and sentiments involving both of these viruses. None of the prior works in this field analyzed tweets focusing on both COVID-19 and MPox simultaneously. To address this research gap, a total of 61,862 tweets that focused on MPox and COVID-19 simultaneously, posted between 7 May 2022 and 3 March 2023, were studied. The findings and contributions of this study are manifold. First, sentiment analysis using the VADER approach shows that nearly half the tweets had a negative sentiment, followed by tweets with a positive sentiment and tweets with a neutral sentiment, in that order. Second, this paper presents the top 50 hashtags used in these tweets. Third, it presents the 100 most frequently used words in these tweets, obtained after tokenization, removal of stopwords, and word frequency analysis. Finally, a comparative study of the contributions of this paper against 49 prior works in this field is presented to further support the relevance and novelty of this work.
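The text-processing steps named above (hashtag ranking, tokenization, stopword removal, word frequency analysis, and three-way sentiment labeling) can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the authors' pipeline: the stopword set, the regular expressions, and all function names are assumptions made for the example. The ±0.05 cutoff on the compound score follows the convention suggested in the VADER documentation; the abstract does not state which thresholds the study used. In practice the compound score would come from a VADER implementation, for instance `SentimentIntensityAnalyzer().polarity_scores(text)["compound"]` in the `vaderSentiment` package.

```python
import re
from collections import Counter

# Illustrative stopword subset (an assumption); a real analysis would use a
# complete list such as the one shipped with an NLP toolkit.
STOPWORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "are",
             "for", "on", "with", "this", "that", "it"}

HASHTAG_RE = re.compile(r"#\w+")
MENTION_RE = re.compile(r"@\w+")
URL_RE = re.compile(r"https?://\S+")
# Words may contain internal hyphens or apostrophes, e.g. "covid-19".
TOKEN_RE = re.compile(r"[a-z0-9]+(?:[-'][a-z0-9]+)*")


def top_hashtags(tweets, n=50):
    """Return the n most frequent hashtags, compared case-insensitively."""
    counts = Counter(
        tag.lower() for tweet in tweets for tag in HASHTAG_RE.findall(tweet)
    )
    return counts.most_common(n)


def top_words(tweets, n=100):
    """Tokenize, drop stopwords, and return the n most frequent words."""
    counts = Counter()
    for tweet in tweets:
        text = tweet.lower()
        # Strip URLs, hashtags, and mentions so only ordinary words remain.
        for pattern in (URL_RE, HASHTAG_RE, MENTION_RE):
            text = pattern.sub(" ", text)
        counts.update(
            tok for tok in TOKEN_RE.findall(text) if tok not in STOPWORDS
        )
    return counts.most_common(n)


def label_from_compound(compound, threshold=0.05):
    """Map a VADER-style compound score in [-1, 1] to a sentiment label."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"
```

Hashtags are counted separately from ordinary words here so that a tag such as `#mpox` does not inflate the frequency of the word "mpox"; whether the study made the same choice is not stated in the abstract.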