A community needs assessment is a tool used by non-profits and government agencies to quantify the strengths and issues of a community so that resources can be allocated more effectively. Such assessments are increasingly turning to social media conversations to analyze the needs of communities and the assets already present within them. However, manually analyzing the exponentially growing volume of social media conversations is infeasible, and the existing literature offers no computational approach for analyzing how community members discuss the strengths and needs of their community. To address this gap, we introduce the task of identifying, extracting, and categorizing community needs and assets from conversational data using natural language processing methods. To facilitate this task, we introduce the first dataset of community needs and assets, consisting of 3,511 conversations from Reddit annotated by crowdworkers. Using this dataset, we evaluate an utterance-level classification model against a sentiment-classification baseline and a popular large language model (in a zero-shot setting); our model outperforms both baselines, achieving an F1 score of 94% compared to 49% and 61%, respectively. We further observe that conversations about needs carry negative sentiments and emotions, while conversations about assets focus on locations and entities. The dataset is available at https://github.com/towhidabsar/CommunityNeeds.