In recent years, reports and anecdotal evidence have pointed to the role of WhatsApp in a variety of events, ranging from elections to collective violence. While academic research should examine the validity of these claims, obtaining WhatsApp data for research is notably challenging, in contrast to the relative abundance of data from platforms like Facebook and Twitter, where user "information diets" have been extensively studied. This lack of data is particularly problematic because misinformation and hate speech are major concerns in the Global South countries where WhatsApp dominates the messaging market. To help make research on these questions, and research on WhatsApp more generally, possible, this paper introduces WhatsApp Explorer, a tool designed to enable large-scale WhatsApp data collection. We discuss protocols for data collection, including potential sampling approaches, and explain why our tool (and adjoining protocol) arguably allows researchers to collect WhatsApp data ethically and legally, at scale.