The conflict between Israel and Palestinians escalated significantly after the October 7, 2023 Hamas attack, capturing global attention. To understand the public discourse on this conflict, we present IsamasRed, a meticulously compiled dataset comprising nearly 400,000 conversations and over 8 million comments from Reddit, spanning August to November 2023. We introduce an innovative keyword extraction framework that leverages a large language model to identify pertinent keywords and ensure comprehensive data collection. Our initial analysis of the dataset, which examines topics, controversy, and emotional and moral language trends over time, highlights the emotionally charged and complex nature of the discourse. The dataset aims to enrich the understanding of online discussions, shedding light on the complex interplay between ideology, sentiment, and community engagement in digital spaces.