Automated Summarization of Stack Overflow Posts

Software developers often turn to Stack Overflow (SO) to meet their programming needs. Given the abundance of relevant posts, navigating them and comparing different solutions is tedious and time-consuming. Recent work has proposed automatically summarizing SO posts into concise text to make them easier to navigate. However, these techniques rely only on information retrieval methods or heuristics for text summarization, which are insufficient to handle the ambiguity and sophistication of natural language. This paper presents ASSORT, a deep-learning-based framework for SO post summarization. ASSORT includes two complementary learning methods, ASSORT_S and ASSORT_{IS}, to address the lack of labeled training data for SO post summarization. ASSORT_S directly trains a novel ensemble learning model with BERT embeddings and domain-specific features to account for the unique characteristics of SO posts. By contrast, ASSORT_{IS} reuses pre-trained models while addressing the domain shift challenge when no training data is available (i.e., zero-shot learning). ASSORT_S and ASSORT_{IS} outperform six existing techniques by at least 13% and 7%, respectively, in terms of F1 score. Furthermore, a human study shows that participants significantly preferred summaries generated by ASSORT_S and ASSORT_{IS} over those of the best baseline, while the difference in preference between ASSORT_S and ASSORT_{IS} was small.
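The abstract reports results as F1 scores but does not say how they are computed. As a minimal sketch, assuming a summary is formed by selecting sentences from a post and is compared against a human-selected reference set, the metric could be calculated as below. The function name `sentence_f1` and the index-based representation are illustrative assumptions, not taken from the paper.

```python
def sentence_f1(predicted, gold):
    """Return (precision, recall, F1) over sets of selected sentence indices.

    Assumption: a summary is represented by the indices of the sentences
    chosen from the post; this representation is hypothetical.
    """
    predicted, gold = set(predicted), set(gold)
    overlap = len(predicted & gold)
    if overlap == 0:
        # No shared sentences (or an empty selection): all scores are zero.
        return 0.0, 0.0, 0.0
    precision = overlap / len(predicted)
    recall = overlap / len(gold)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    # The model picks sentences 0, 2, 5; annotators picked 0, 2, 3, 7.
    p, r, f1 = sentence_f1([0, 2, 5], [0, 2, 3, 7])
    print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

F1 is the harmonic mean of precision and recall, so a method cannot score well simply by selecting very few or very many sentences.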

