Real-time social media data can provide useful information on evolving hazards. Integrated with traditional methods of disaster detection, social media data can considerably enhance disaster management. In this paper, we investigate the problem of detecting geolocation-content communities on Twitter and propose a novel distributed system that provides near-real-time information on hazard-related events and their evolution. We show that content-based community analysis leads to better and faster dissemination of reports on hazards. Our distributed disaster reporting system analyzes the social relationships among geolocated tweets worldwide and applies topic modeling to group tweets by topic. Using the user, timestamp, geolocation, retweets, and replies of each tweet, we create a publisher-subscriber distribution model for topics. We use content similarity and the proximity of nodes to create a new model for geolocation-content-based communities. Users can subscribe to different topics in specific geographical areas or worldwide and receive real-time reports regarding these topics. Because misinformation propagated in hazard-related tweets can increase damage, we propose a new deep learning model to detect fake news. Tweets flagged as misinformation are then removed from display. We also demonstrate the scalability of the proposed system empirically.
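To make the idea of combining content similarity with node proximity, and of subscribing to a topic within a geographical area, more concrete, the following is a minimal illustrative sketch. It is not the system described in this paper: the names `Tweet`, `Subscription`, `geo_content_score`, and `matches`, the weight `alpha`, and the decay scale `scale_km` are all hypothetical, and a simple bag-of-words cosine similarity stands in for the topic modeling step.

```python
import math
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Tweet:
    """Hypothetical minimal tweet record (not the paper's schema)."""
    user: str
    text: str
    lat: float
    lon: float


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two points."""
    radius_km = 6371.0  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def cosine_similarity(text_a: str, text_b: str) -> float:
    """Bag-of-words cosine similarity; an assumed stand-in for topic similarity."""
    ta, tb = Counter(text_a.lower().split()), Counter(text_b.lower().split())
    dot = sum(ta[w] * tb[w] for w in ta.keys() & tb.keys())
    norm_a = math.sqrt(sum(v * v for v in ta.values()))
    norm_b = math.sqrt(sum(v * v for v in tb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def geo_content_score(t1: Tweet, t2: Tweet, alpha: float = 0.5, scale_km: float = 50.0) -> float:
    """Weighted mix of content similarity and distance-decayed proximity (assumed form)."""
    proximity = math.exp(-haversine_km(t1.lat, t1.lon, t2.lat, t2.lon) / scale_km)
    return alpha * cosine_similarity(t1.text, t2.text) + (1 - alpha) * proximity


@dataclass(frozen=True)
class Subscription:
    """Hypothetical subscription: a topic, optionally limited to a circular area."""
    topic: str
    lat: Optional[float] = None
    lon: Optional[float] = None
    radius_km: Optional[float] = None  # None means worldwide


def matches(sub: Subscription, tweet: Tweet, tweet_topic: str) -> bool:
    """True if the tweet's topic matches and it falls inside the subscribed area."""
    if sub.topic != tweet_topic:
        return False
    if sub.radius_km is None or sub.lat is None or sub.lon is None:
        return True  # worldwide subscription
    return haversine_km(sub.lat, sub.lon, tweet.lat, tweet.lon) <= sub.radius_km
```

In this sketch, a higher `alpha` favours textual agreement over closeness, and `scale_km` controls how quickly the proximity term decays with distance; both would need tuning against real data.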