We explore the understudied area of social payments to evaluate whether we can predict the gender and political affiliation of Venmo users from the content of their Venmo transactions. Latent attribute detection has been applied successfully to social media data. However, little prior work uses data from platforms other than Twitter, and mobile payment platforms such as Venmo remain understudied because data access is limited. We hypothesize that, using methods similar to those used for latent attribute analysis of Twitter data, machine learning algorithms can predict the gender and political affiliation of Venmo users with a moderate degree of accuracy. Through the paid Prolific service, we collected crowdsourced training data that links participants' political views to their public Venmo transaction history. Additionally, we collected 21 million public Venmo transactions from recently active users to use for gender classification. We then transformed the collected data with a TF-IDF vectorizer and used the resulting features to train a support vector machine (SVM). After hyperparameter tuning and additional feature engineering, we were able to predict users' gender with a high level of accuracy (0.91) and had modest success predicting users' political orientation (0.63).
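The TF-IDF and SVM steps described above can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: the transaction notes, the class labels (+1 and -1), the function names, the smoothed IDF variant, and the Pegasos-style sub-gradient training loop with regularization strength `lam` are all assumptions chosen for illustration. A real pipeline would more likely rely on an established library.

```python
import math
from collections import Counter

def tokenize(note):
    """Lowercase and split a transaction note on whitespace."""
    return note.lower().split()

def fit_tfidf(notes):
    """Learn a vocabulary and smoothed IDF weights from training notes."""
    docs = [tokenize(n) for n in notes]
    n_docs = len(docs)
    df = Counter(tok for doc in docs for tok in set(doc))  # document frequency
    vocab = {tok: i for i, tok in enumerate(sorted(df))}
    idf = {tok: math.log((1 + n_docs) / (1 + df[tok])) + 1.0 for tok in vocab}
    return vocab, idf

def to_tfidf(note, vocab, idf):
    """Return an L2-normalized sparse TF-IDF vector as {index: weight}."""
    counts = Counter(tok for tok in tokenize(note) if tok in vocab)
    vec = {vocab[tok]: c * idf[tok] for tok, c in counts.items()}
    norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
    return {i: v / norm for i, v in vec.items()}

def train_linear_svm(X, y, dim, lam=0.01, epochs=200):
    """Linear SVM (hinge loss, no bias term) via Pegasos-style sub-gradient steps."""
    w = [0.0] * dim
    t = 0
    for _ in range(epochs):
        for x, label in zip(X, y):
            t += 1
            eta = 1.0 / (lam * t)                      # decaying step size
            margin = label * sum(w[i] * v for i, v in x.items())
            w = [wi * (1.0 - eta * lam) for wi in w]   # L2 regularization shrink
            if margin < 1.0:                           # hinge loss is active
                for i, v in x.items():
                    w[i] += eta * label * v
    return w

def predict(note, vocab, idf, w):
    """Classify a note as +1 or -1 from the sign of the linear score."""
    x = to_tfidf(note, vocab, idf)
    score = sum(w[i] * v for i, v in x.items())
    return 1 if score >= 0 else -1

# Hypothetical toy data: notes and labels are invented for illustration only.
notes = ["rent utilities", "rent wifi", "pizza dinner", "tacos dinner"]
labels = [1, 1, -1, -1]

vocab, idf = fit_tfidf(notes)
X = [to_tfidf(n, vocab, idf) for n in notes]
w = train_linear_svm(X, labels, dim=len(vocab))
train_accuracy = sum(predict(n, vocab, idf, w) == y
                     for n, y in zip(notes, labels)) / len(notes)
```

In this sketch, `lam` plays the role of a hyperparameter that would be tuned, for example by cross-validation, and accuracy should be measured on held-out users rather than on the training notes as done here.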