Analyzing Trendy Twitter Hashtags in the 2022 French Election

Regression models trained to predict the future activity of social media users need rich features to be accurate. Many advanced models exist to generate such features; however, their computational cost is often prohibitive on very large datasets. Some studies have shown that simple semantic network features can be rich enough for regression without requiring complex computations. We propose a method for using semantic networks as user-level features for machine learning tasks. We conducted an experiment using a semantic network of 1037 Twitter hashtags from a corpus of 3.7 million tweets related to the 2022 French presidential election. A bipartite graph is formed in which hashtags are nodes and each weighted edge between two hashtags reflects the number of Twitter users who interacted with both. The graph is then transformed into a maximum spanning tree rooted at the most popular hashtag, which induces a hierarchy among the hashtags. We then derive a feature vector for each user based on this tree. To validate the usefulness of our semantic feature, we performed a regression experiment to predict each user's response rate for six emotions, such as anger, enjoyment, and disgust. Our semantic feature performs well in the regression, with most emotions having $R^2$ above 0.5. These results suggest that our semantic feature could be considered for further experiments predicting social media response on big datasets.
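The pipeline described in the abstract (co-occurrence graph, maximum spanning tree rooted at the most popular hashtag, per-user vector) can be illustrated with a minimal sketch. The toy data, the function names, and in particular the way the user vector is derived from the tree (marking each hashtag a user interacted with, plus all of its ancestors) are assumptions made for illustration; the abstract does not specify how the vector is built.

```python
from collections import Counter, defaultdict
from itertools import combinations
import heapq

# Toy data (hypothetical): user -> hashtags that user interacted with.
user_hashtags = {
    "u1": {"#macron", "#lepen", "#election"},
    "u2": {"#macron", "#election"},
    "u3": {"#lepen", "#election", "#debat"},
    "u4": {"#macron", "#debat"},
    "u5": {"#election", "#debat"},
}

def cooccurrence_edges(user_hashtags):
    """Edge weight = number of users who interacted with both hashtags."""
    weights = Counter()
    for tags in user_hashtags.values():
        for a, b in combinations(sorted(tags), 2):
            weights[(a, b)] += 1
    return weights

def maximum_spanning_tree(weights, root):
    """Prim's algorithm with a max-heap (negated weights).

    Returns a child -> parent map; the root maps to None.
    """
    adj = defaultdict(list)
    for (a, b), w in weights.items():
        adj[a].append((w, b))
        adj[b].append((w, a))
    parent = {root: None}
    heap = [(-w, nbr, root) for w, nbr in adj[root]]
    heapq.heapify(heap)
    while heap:
        _, node, par = heapq.heappop(heap)
        if node in parent:
            continue  # already attached through a heavier edge
        parent[node] = par
        for w, nbr in adj[node]:
            if nbr not in parent:
                heapq.heappush(heap, (-w, nbr, node))
    return parent

def user_vector(tags, parent, index):
    """Assumed encoding: mark each used hashtag and all its tree ancestors."""
    vec = [0] * len(index)
    for tag in tags:
        node = tag if tag in parent else None
        while node is not None:
            vec[index[node]] = 1
            node = parent[node]
    return vec

# Root = hashtag with the most distinct users.
popularity = Counter(t for tags in user_hashtags.values() for t in tags)
root = popularity.most_common(1)[0][0]

weights = cooccurrence_edges(user_hashtags)
parent = maximum_spanning_tree(weights, root)
index = {tag: i for i, tag in enumerate(sorted(parent))}
features = {u: user_vector(tags, parent, index) for u, tags in user_hashtags.items()}
```

The resulting `features` dictionary could then be passed to any standard regressor. Prim's algorithm is used here only because it grows the tree outward from a chosen root, which matches the rooted hierarchy the abstract describes; any maximum spanning tree algorithm followed by a traversal from the root would serve equally well.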

