Recently, people's awareness of online purchasing has risen significantly. This has given rise to online retail platforms and to the need for a better understanding of customer purchasing behaviour. Retail companies must deal with a high volume of customer purchases, which requires sophisticated approaches to segment customers more accurately and efficiently. Customer segmentation is a marketing analytics tool that supports customer-centric service and thus enhances profitability. In this paper, we aim to develop a customer segmentation model to improve decision-making in the retail industry. To achieve this, we used a UK-based online retail dataset obtained from the UCI Machine Learning Repository. The dataset consists of 541,909 customer records and eight features. We adopted the RFM (recency, frequency, and monetary) framework to quantify customer value. We then compared several state-of-the-art (SOTA) clustering algorithms, namely K-means, the Gaussian mixture model (GMM), density-based spatial clustering of applications with noise (DBSCAN), agglomerative clustering, and balanced iterative reducing and clustering using hierarchies (BIRCH). The results showed that the GMM outperformed the other approaches, with a Silhouette Score of 0.80.
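As an illustration of the two quantities the abstract relies on, the following minimal sketch computes RFM values per customer and a mean silhouette score for a given cluster assignment. It is not the study's pipeline: the tuple layout of each transaction, the helper names `rfm_table` and `mean_silhouette`, the snapshot date, and the toy rows are all assumptions made for this example, and any preprocessing, scaling, or hyperparameter choices used in the paper are not reproduced.

```python
from collections import defaultdict
from datetime import date
from math import dist


def rfm_table(transactions, snapshot):
    """Return {customer_id: (recency_days, frequency, monetary)}.

    Each transaction is assumed (for this sketch) to be a tuple
    (customer_id, invoice_no, invoice_date, quantity, unit_price).
    """
    last_purchase = {}
    invoices = defaultdict(set)
    spend = defaultdict(float)
    for customer_id, invoice_no, invoice_date, quantity, unit_price in transactions:
        previous = last_purchase.get(customer_id)
        if previous is None or invoice_date > previous:
            last_purchase[customer_id] = invoice_date
        invoices[customer_id].add(invoice_no)        # frequency counts distinct invoices
        spend[customer_id] += quantity * unit_price  # monetary is total spend
    return {
        cid: ((snapshot - last_purchase[cid]).days, len(invoices[cid]), round(spend[cid], 2))
        for cid in last_purchase
    }


def mean_silhouette(points, labels):
    """Mean silhouette coefficient with Euclidean distance.

    For each point: a = mean distance to the other points in its cluster,
    b = smallest mean distance to the points of any other cluster,
    s = (b - a) / max(a, b). Points in singleton clusters contribute 0.
    """
    clusters = defaultdict(list)
    for point, label in zip(points, labels):
        clusters[label].append(point)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")
    total = 0.0
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            continue
        # dist(point, point) is 0, so dividing by len(own) - 1 excludes the point itself
        a = sum(dist(point, q) for q in own) / (len(own) - 1)
        b = min(
            sum(dist(point, q) for q in other) / len(other)
            for key, other in clusters.items()
            if key != label
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


# Hypothetical toy rows, not taken from the dataset
rows = [
    ("C1", "INV1", date(2011, 11, 1), 2, 5.0),
    ("C1", "INV1", date(2011, 11, 1), 1, 3.5),
    ("C1", "INV2", date(2011, 12, 1), 4, 2.5),
    ("C2", "INV3", date(2011, 10, 15), 10, 1.0),
]
rfm = rfm_table(rows, snapshot=date(2011, 12, 10))

# Two well-separated toy clusters give a score close to 1
score = mean_silhouette([(0, 0), (0, 1), (10, 0), (10, 1)], [0, 0, 1, 1])
```

A score near 1 indicates compact, well-separated clusters, which is the sense in which a higher Silhouette Score is read as better segmentation. In practice a library routine such as scikit-learn's `sklearn.metrics.silhouette_score` would normally replace the hand-written function.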