During times of crisis, social media platforms play a vital role in facilitating communication and coordinating resources. Amidst chaos and uncertainty, communities often rely on these platforms to share urgent pleas for help, extend support, and organize relief efforts. However, the sheer volume of conversations during such periods, which can escalate to unprecedented levels, necessitates the automated identification and matching of requests and offers to streamline relief operations. This study addresses the challenge of efficiently identifying and matching assistance requests and offers on social media platforms during emergencies. We propose CReMa (Crisis Response Matcher), a systematic approach that integrates textual, temporal, and spatial features for multi-lingual request-offer matching. By leveraging CrisisTransformers, a set of pre-trained models specific to crises, and a cross-lingual embedding space, our methodology improves both the identification and matching tasks, outperforming strong baselines such as RoBERTa, MPNet, and BERTweet in classification tasks, and Universal Sentence Encoder and Sentence Transformers in crisis embedding generation tasks. We introduce a novel multi-lingual dataset that simulates help-seeking and assistance-offering scenarios on social media across the 16 most commonly used languages in Australia. We conduct comprehensive cross-lingual experiments across these 16 languages, while also examining the trade-offs between multiple vector search strategies and accuracy. Additionally, we analyze a million-scale geotagged global dataset to understand patterns of help-seeking and assistance-offering on social media. Overall, these contributions advance the field of crisis informatics and provide benchmarks for future research in the area.
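To make the core matching idea concrete, the sketch below illustrates the accuracy end of the vector-search trade-off mentioned above: exact nearest-neighbour matching of request embeddings against offer embeddings by cosine similarity. The toy vectors, the `match_requests_to_offers` helper, and the similarity threshold are illustrative assumptions, not the paper's implementation; in CReMa the vectors would come from a cross-lingual embedding model such as CrisisTransformers, and approximate search indexes would trade some accuracy for speed at scale.

```python
import math

def cosine(a, b):
    # Cosine similarity between two embedding vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def match_requests_to_offers(request_vecs, offer_vecs, threshold=0.5):
    """Exact (brute-force) matching: for each request embedding, find the
    most similar offer embedding; keep the pair only if it clears the
    threshold. O(requests x offers), so accurate but slow at scale."""
    matches = []
    for i, r in enumerate(request_vecs):
        best_score, best_j = max(
            (cosine(r, o), j) for j, o in enumerate(offer_vecs)
        )
        if best_score >= threshold:
            matches.append((i, best_j, best_score))
    return matches

# Toy 3-d vectors standing in for cross-lingual sentence embeddings:
# request 0 should pair with offer 0, request 1 with offer 1.
requests = [[1.0, 0.0, 0.2], [0.1, 0.9, 0.0]]
offers = [[0.9, 0.1, 0.1], [0.0, 1.0, 0.1]]
print(match_requests_to_offers(requests, offers))
```

Because the embedding space is cross-lingual, a request written in one language and an offer written in another can still land close together and be paired by this same similarity comparison.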