Psycholinguistic Analyses in Software Engineering Text: A Systematic Literature Review

Context: A deeper understanding of human factors in software engineering (SE) is essential for improving team collaboration, decision-making, and productivity. Communication channels such as code reviews and chats provide insights into developers' psychological and emotional states. While large language models excel at text analysis, they often lack transparency and precision. Psycholinguistic tools such as Linguistic Inquiry and Word Count (LIWC) offer clearer, more interpretable insights into the cognitive and emotional processes exhibited in text. Despite LIWC's wide use in SE research, no comprehensive review of that use has been conducted.

Objective: We examine the importance of psycholinguistic tools, particularly LIWC, and provide a thorough analysis of LIWC's current and potential future applications in SE research.

Methods: We conducted a systematic review of six prominent databases, identifying 43 SE-related papers that use LIWC. Our analysis focuses on five research questions.

Results: Our findings reveal a wide range of applications, including analyzing team communication to detect developer emotions and personality, developing ML models to predict deleted Stack Overflow posts, and, more recently, comparing AI-generated and human-written text. LIWC has been used primarily with data from project management platforms (e.g., GitHub) and Q&A forums (e.g., Stack Overflow). Key behavioral software engineering (BSE) concepts include Communication, Organizational Climate, and Positive Psychology. Twenty-six of the 43 papers did not formally evaluate LIWC. The reviewed papers raised concerns about several limitations, including difficulty handling SE-specific vocabulary.

Conclusion: We highlight the potential of psycholinguistic tools and their limitations, and present new use cases for advancing research on human factors in SE (e.g., bias in human-LLM conversations).

