Shedding Light on Software Engineering-specific Metaphors and Idioms

Figurative language, such as metaphors and idioms, is common in everyday communication and also appears in Software Engineering (SE) channels, such as comments on GitHub. Automatically interpreting figurative language is challenging, even for modern Large Language Models (LLMs), because it often involves subtle nuances. This is particularly true in the SE domain, where figurative language is frequently used to convey technical concepts, often bearing developer affect (e.g., "spaghetti code"). Surprisingly, there is a lack of studies on how figurative language in SE communications affects the performance of automatic tools that aim to understand developer communications, e.g., for bug prioritization or incivility detection. Furthermore, it remains an open question to what extent state-of-the-art LLMs can interpret figurative expressions in domain-specific communication such as that of software engineering. To address this gap, we study the prevalence and impact of figurative language in SE communication channels. This study contributes to understanding the role of figurative language in SE, the potential of LLMs to interpret it, and its impact on automated SE communication analysis. Our results demonstrate the effectiveness of fine-tuning LLMs on figurative language from SE and its potential impact on automated tasks that involve affect. Among three state-of-the-art LLMs, the best fine-tuned versions achieve an average improvement of 6.66% on a GitHub emotion classification dataset, 7.07% on a GitHub incivility classification dataset, and 3.71% on a Bugzilla bug report prioritization dataset.

