Selecting techniques is a crucial part of planning the business analysis approach in IT projects, and the choice of requirements elicitation techniques receives particular attention. One promising way to select techniques is to use machine learning (ML) algorithms trained on practitioners' experience across different project contexts. The effectiveness of ML models depends significantly on the balance of the training dataset, and this balance is violated for popular techniques. This paper analyzes the efficiency of using the Synthetic Minority Over-sampling Technique (SMOTE) in ML models for elicitation technique selection when the training dataset is imbalanced, and examines possible ways of selecting features with positive importance. The results of the computational experiment confirmed that the proposed approaches improve the accuracy of ML models for selecting requirements elicitation techniques. The proposed approaches can be used to build ML models for planning business analysis activities in IT projects.
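To make the oversampling step concrete, the following is a minimal sketch of the core idea behind the Synthetic Minority Over-sampling Technique: each synthetic row is placed at a random point on the line segment between a minority-class sample and one of its nearest minority-class neighbours. It is not the implementation used in the paper. The function name `smote`, its parameters, and the toy data are hypothetical and chosen only for illustration; in practice a maintained library implementation would normally be preferred.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Return n_synthetic points interpolated between minority-class samples.

    Hypothetical helper for illustration; `minority` is a list of
    equal-length numeric feature vectors.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other samples
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base sample.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Place the new point at a random position between base and neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy data, invented for illustration: two numeric project-context features per row.
majority = [[0.9, 0.8], [0.8, 0.7], [0.7, 0.9], [0.95, 0.6],
            [0.85, 0.75], [0.6, 0.8], [0.75, 0.65], [0.9, 0.9]]
minority = [[0.1, 0.2], [0.2, 0.1], [0.15, 0.3]]

# Generate just enough synthetic rows to match the majority class size.
synthetic = smote(minority, len(majority) - len(minority), k=2, seed=42)
balanced_minority = minority + synthetic  # both classes now have 8 rows
```

Because the synthetic rows are interpolated rather than copied, the minority class is enlarged without exact duplicates. Oversampling of this kind is usually applied to the training split only, so that evaluation data stays free of synthetic rows.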