Context: Logging is a crucial practice in software engineering that helps developers debug applications when errors occur. While existing research has explored logging challenges from an academic perspective through literature reviews and source code analysis, a comprehensive study from the practitioners' perspective is still lacking. Objective: This paper aims to bridge this knowledge gap by presenting an in-depth analysis of trends, topics, and challenges in logging, using a dataset of 216,094 posts from Stack Overflow (SO), a popular Q\&A platform for developers. Method: We analyzed longitudinal trends by examining metadata on the users, questions, and tags associated with logging discussions. To identify prevalent discussion topics, we employed a Large Language Model (LLM)--based classification approach grounded in a manually validated ground-truth sample. Topic popularity was assessed through average scores and views, while difficulty was measured using three community-driven metrics: the proportion of questions without accepted answers, the proportion of unanswered questions, and the median time to receive an accepted answer. Results: Our analysis identifies 11 distinct topics, with the top three (General Logging Practices, Error Handling and Debugging, and Logging Levels and Output) accounting for over 70\% of all logging-related discussions. Notably, Logging in Containerized Environments emerged as the most difficult topic: 64.9\% of its questions lack an accepted answer, and its median resolution time is among the highest. These findings highlight enduring practitioner struggles with logging in Docker and other containerized environments, and with integrating logging pipelines into orchestrators such as Kubernetes and into cloud environments. Conclusion: This study sheds light on the practical challenges of logging and provides actionable insights for developers, framework vendors, researchers, and educators.
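The three difficulty metrics named in the Method can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the `Question` record, its field names, and the `difficulty` helper are hypothetical, and the sample data is invented purely to exercise the arithmetic. The sketch assumes each question carries a creation time, an answer count, and an optional acceptance time.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import median
from typing import Optional


@dataclass
class Question:
    """Hypothetical record for one question; field names are assumptions."""
    topic: str
    created: datetime
    answer_count: int
    accepted_at: Optional[datetime] = None  # None when no answer was accepted


def difficulty(questions):
    """Compute the three difficulty metrics for a list of questions."""
    total = len(questions)
    if total == 0:
        raise ValueError("no questions to summarize")
    # Questions that never received an accepted answer.
    no_accepted = sum(q.accepted_at is None for q in questions)
    # Questions that received no answer at all.
    unanswered = sum(q.answer_count == 0 for q in questions)
    # Hours from posting to acceptance, over questions that have one.
    hours = [
        (q.accepted_at - q.created).total_seconds() / 3600
        for q in questions
        if q.accepted_at is not None
    ]
    return {
        "pct_without_accepted": 100 * no_accepted / total,
        "pct_unanswered": 100 * unanswered / total,
        "median_hours_to_accepted": median(hours) if hours else None,
    }


# Invented sample: four questions on one topic.
t0 = datetime(2024, 1, 1, 0, 0)
sample = [
    Question("containers", t0, 2, datetime(2024, 1, 1, 2, 0)),
    Question("containers", t0, 1, datetime(2024, 1, 1, 6, 0)),
    Question("containers", t0, 1),
    Question("containers", t0, 0),
]
metrics = difficulty(sample)
```

Grouping questions by topic and calling `difficulty` on each group would yield one row per topic, which is the shape needed to rank topics by difficulty.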