Nowadays, neural network (NN) and deep learning (DL) techniques are widely adopted in many applications, including recommender systems. Given the sparse and stochastic nature of collaborative filtering (CF) data, recent works have critically analyzed the actual improvement that neural-based approaches deliver over simpler and often transparent recommendation algorithms. Previous results showed that NN and DL models can be outperformed by traditional algorithms in many tasks. Moreover, given the largely black-box nature of neural-based methods, interpretable results are not naturally obtained. Following up on this debate, we first present a transparent probabilistic model that topologically organizes user and product latent classes based on review information. In contrast to popular neural techniques for representation learning, we readily obtain a statistical, visualization-friendly tool that can be easily inspected to understand user and product characteristics from a text-based perspective. Then, given the limitations of common embedding techniques, we investigate the possibility of using the estimated interpretable quantities as model input for a rating prediction task. To contribute to the recent debates, we evaluate our results in terms of both interpretability and predictive performance, in comparison with popular text-based neural approaches. The results demonstrate that the proposed latent class representations can yield competitive predictive performance compared to popular but difficult-to-interpret approaches.