App reviews reflect various user requirements that can aid in planning maintenance tasks. Recently proposed approaches for automatically classifying user reviews rely on machine learning algorithms. A previous study demonstrated that models trained on existing labeled datasets perform poorly when predicting on new ones. Therefore, a comprehensive labeled dataset is essential for training a more precise model. In this paper, we propose a novel approach that assists in augmenting labeled datasets by utilizing information extracted from an additional source, GitHub issues, which contain valuable information about user requirements. First, we identify issues concerning review intentions (bug reports, feature requests, and others) by examining the issue labels. Then, we analyze issue bodies and define 19 language patterns for extracting targeted information. Finally, we augment the manually labeled review dataset with a subset of processed issues through the \emph{Within-App}, \emph{Within-Context}, and \emph{Between-App Analysis} methods. We conducted several experiments to evaluate the proposed approach. Our results demonstrate that using labeled issues for data augmentation can improve the F1-score by 6.3 for bug reports and 7.2 for feature requests. Furthermore, we identify an effective range of 0.3 to 0.7 for the auxiliary data volume, which provides greater performance improvements.
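To make the first two steps of the pipeline concrete, the following is a minimal sketch, not the paper's implementation: it assumes a hypothetical label-to-intention mapping and a single illustrative regex standing in for one of the 19 language patterns, whose exact definitions are not reproduced here.

```python
import re

# Hypothetical mapping from GitHub issue labels to review intentions
# (bug report, feature request, other); the actual label sets used in
# the paper may differ.
LABEL_MAP = {
    "bug": "bug report",
    "defect": "bug report",
    "enhancement": "feature request",
    "feature": "feature request",
}

# One illustrative language pattern for extracting targeted information
# from an issue body; this regex is an assumption, not one of the
# paper's 19 patterns.
STEPS_TO_REPRODUCE = re.compile(
    r"(?:steps to reproduce|how to reproduce)\s*:?\s*(.+)",
    re.IGNORECASE | re.DOTALL,
)

def classify_issue(labels):
    """Map an issue's labels to a review intention, defaulting to 'other'."""
    for label in labels:
        intention = LABEL_MAP.get(label.lower())
        if intention:
            return intention
    return "other"

def extract_target_text(body):
    """Return the span matched by the language pattern, if any."""
    match = STEPS_TO_REPRODUCE.search(body)
    return match.group(1).strip() if match else None

# Example usage on a toy issue
issue = {
    "labels": ["bug", "priority-high"],
    "body": "App crashes on login.\nSteps to reproduce: open the app, tap Sign In.",
}
print(classify_issue(issue["labels"]))    # -> bug report
print(extract_target_text(issue["body"]))  # -> open the app, tap Sign In.
```

Issues that receive an intention label and yield a pattern match would then be candidates for augmenting the manually labeled review dataset via the \emph{Within-App}, \emph{Within-Context}, and \emph{Between-App Analysis} methods described above.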