App reviews reflect various user requirements that can aid in planning maintenance tasks. Recently proposed approaches for automatically classifying user reviews rely on machine learning algorithms. A previous study demonstrated that models trained on existing labeled datasets perform poorly when predicting new ones. A comprehensive labeled dataset is therefore essential for training a more precise model. In this paper, we propose a novel approach that helps augment labeled review datasets with information extracted from an additional source, GitHub issues, which contain valuable information about user requirements. First, we identify issues related to review intentions (bug reports, feature requests, and others) by examining issue labels. Then, we analyze issue bodies and define 19 language patterns for extracting the targeted information. Finally, we augment the manually labeled review dataset with a subset of the processed issues using the Within-App, Within-Context, and Between-App Analysis methods. We conducted several experiments to evaluate the proposed approach. Our results demonstrate that using labeled issues for data augmentation can improve the F1-score by 6.3 for bug reports and 7.2 for feature requests. Furthermore, we identify an effective range of 0.3 to 0.7 for the volume of auxiliary data, within which the performance improvements are greater.