Exploring Technical Debt in Security Questions on Stack Overflow

Background: Software security is crucial for protecting users from undesirable consequences such as malware attacks, which can result in data loss and, subsequently, financial loss. Technical Debt (TD) is a metaphor for the long-term costs incurred by suboptimal decisions, such as increased defects and vulnerabilities, when they are left unmanaged. Although previous studies have examined the relationship between security and TD, their intersection in developers' discussions on Stack Overflow (SO) remains unexplored. Aims: This study investigates the characteristics of security-related TD questions on SO. More specifically, we explore the prevalence of TD in security-related questions, identify the security tags most prone to TD, and investigate which user groups are more aware of TD. Method: We mined 117,233 security-related questions on SO and used a deep-learning approach to identify 45,078 security-related TD questions. We then conducted quantitative and qualitative analyses of these questions, including sentiment analysis. Results: Our analysis revealed that 38% of the security questions on SO are security-related TD questions. The most frequent tags among the security-related TD questions were "security" and "encryption." Security-related TD questions typically have a neutral sentiment, are longer, and are posed by users with higher reputation scores. Conclusions: Our findings reveal that developers discuss TD implicitly, suggesting a potential knowledge gap regarding the TD metaphor in the security domain. Moreover, we identified the most common security topics mentioned in TD-related posts, providing insights that can help developers and researchers prioritize security concerns in order to minimize TD and enhance software security.
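The prevalence figure in the Results can be checked directly from the two counts given in the Method. The sketch below is only an arithmetic illustration. The function and variable names are hypothetical and are not taken from the study's tooling.

```python
# Counts as reported in the abstract.
total_security_questions = 117_233
security_td_questions = 45_078


def prevalence(td_count: int, total: int) -> float:
    """Return the share of TD questions as a percentage of all questions."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * td_count / total


# Share of security questions that were identified as security-related TD.
share = prevalence(security_td_questions, total_security_questions)
print(f"{share:.1f}% of the mined security questions were identified as TD")
```

Rounded to a whole number, the computed share agrees with the 38% stated in the Results.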
