Mobile applications have become indispensable companions in our daily lives. Spanning categories from communication and entertainment to healthcare and finance, these applications are influential in every aspect of life. Despite their omnipresence, developing apps that meet user needs and expectations remains a challenge. Traditional requirements elicitation methods such as user interviews can be time-consuming and suffer from limited scope and subjectivity. This research introduces an approach that uses Large Language Models (LLMs) to analyze user reviews for automated requirements elicitation. We fine-tuned three well-established LLMs, BERT, DistilBERT, and GEMMA, on a dataset of app reviews labeled for usefulness. In our evaluation, BERT performed best, achieving an accuracy of 92.40% and an F1-score of 92.39%, which demonstrates its effectiveness in classifying useful reviews. Although GEMMA showed lower overall performance, it excelled in recall (93.39%), indicating its potential for capturing a comprehensive set of valuable user insights. These findings suggest that LLMs offer a promising avenue for streamlining requirements elicitation in mobile app development, leading to more user-centric and successful applications.
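The abstract reports accuracy, F1-score, and recall for a binary "useful / not useful" review classifier. As a minimal sketch of how these metrics relate to one another, the snippet below computes them from true and predicted labels. The function name `binary_metrics` and the toy label lists are hypothetical illustrations; they are not taken from the paper's code or dataset.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    precision: float
    recall: float
    f1: float


def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for binary labels.

    Hypothetical helper for illustration only; `positive` marks the
    "useful review" class.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return BinaryMetrics(accuracy, precision, recall, f1)


# Toy labels (assumed, not from the paper): 1 = useful, 0 = not useful.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]

# A balanced classifier: one missed useful review, one false alarm.
balanced = binary_metrics(y_true, [1, 1, 1, 0, 1, 0, 0, 0])

# A classifier that marks every review as useful: perfect recall,
# but lower precision, accuracy, and F1.
flag_all = binary_metrics(y_true, [1, 1, 1, 1, 1, 1, 1, 1])

print(balanced)
print(flag_all)
```

On toy data like this, a classifier that flags every review as useful reaches perfect recall while scoring lower on accuracy and F1. This is one way a model can lead on recall yet trail on overall performance, although the abstract does not state why GEMMA shows that pattern.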